From owner-linux@cthulhu.engr.sgi.com  Tue Apr 30 16:07:21 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id QAA20910; Tue, 30 Apr 1996 16:07:21 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: by cthulhu.engr.sgi.com (950511.SGI.8.6.12.PATCH526/911001.SGI)
	for linux-list id QAA04279; Tue, 30 Apr 1996 16:07:17 -0700
Received: from sgi.sgi.com by cthulhu.engr.sgi.com via ESMTP (950511.SGI.8.6.12.PATCH526/911001.SGI)
	for <linux@engr.sgi.com> id QAA04273; Tue, 30 Apr 1996 16:07:15 -0700
Received: from iron.ingenia.com by sgi.sgi.com via ESMTP (950405.SGI.8.6.12/910110.SGI)
	for <linux@engr.sgi.com> id QAA27631; Tue, 30 Apr 1996 16:07:13 -0700
Received: (from shaver@localhost) by iron.ingenia.com (8.6.9/8.6.9) id TAA12827 for linux@engr.sgi.com; Tue, 30 Apr 1996 19:08:31 -0400
From: Mike Shaver <shaver@ingenia.com>
Message-Id: <199604302308.TAA12827@iron.ingenia.com>
Subject: Let's talk platforms...
To: linux@cthulhu.engr.sgi.com
Date: Tue, 30 Apr 1996 19:08:30 -0400 (EDT)
X-Mailer: ELM [version 2.4 PL25]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length:        905
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

We're getting a whole pile of SGI hardware in to `kick the tires' over
the next 6 months, and I'm wondering which of them I should claim as
my test box.

I can likely choose an Indy, an Indigo, possibly one of the Webforce
boxes (I assume it's similar hardware to the Indigo) and, if I beg and
plead and offer beer, a Challenge-S.

Has an initial port target been designated?

Mike
(having trouble changing the address subscribed to the list, since
majordomo@cthulhu.engr doesn't seem to want to talk to me...)

-- 
#> Mike Shaver (shaver@ingenia.com)                                        <#
#> Technical specialist, pedant, packetsmith                               <#
#>                                     Ingenia Communications Corporations <#
#>                    Research, Development, Support and Sleep Deprivation <#
#>                         Packets crafted, bugs found, rebellions quelled <#

From owner-linux@cthulhu.engr.sgi.com  Tue Apr 30 16:11:30 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id QAA21397; Tue, 30 Apr 1996 16:11:30 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: by cthulhu.engr.sgi.com (950511.SGI.8.6.12.PATCH526/911001.SGI)
	for linux-list id QAA04672; Tue, 30 Apr 1996 16:11:22 -0700
Received: from ares.esd.sgi.com by cthulhu.engr.sgi.com via ESMTP (950511.SGI.8.6.12.PATCH526/911001.SGI)
	for <linux@cthulhu.engr.sgi.com> id QAA04664; Tue, 30 Apr 1996 16:11:21 -0700
Received: from fir.esd.sgi.com by ares.esd.sgi.com via ESMTP (951211.SGI.8.6.12.PATCH1042/950213.SGI.AUTOCF)
	 id QAA12090; Tue, 30 Apr 1996 16:11:20 -0700
Received: by fir.esd.sgi.com (940816.SGI.8.6.9/920502.SGI.AUTO)
	 id QAA15849; Tue, 30 Apr 1996 16:11:05 -0700
Date: Tue, 30 Apr 1996 16:11:05 -0700
From: wje@fir.esd.sgi.com (William J. Earl)
Message-Id: <199604302311.QAA15849@fir.esd.sgi.com>
To: Mike Shaver <shaver@ingenia.com>
Cc: linux@cthulhu.engr.sgi.com
Subject: Re: Let's talk platforms...
In-Reply-To: <199604302308.TAA12827@iron.ingenia.com>
References: <199604302308.TAA12827@iron.ingenia.com>
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

Mike Shaver writes:
 > We're getting a whole pile of SGI hardware in to `kick the tires' over
 > the next 6 months, and I'm wondering which of them I should claim as
 > my test box.
 > 
 > I can likely choose an Indy, an Indigo, possibly one of the Webforce
 > boxes (I assume it's similar hardware to the Indigo) and, if I beg and
 > plead and offer beer, a Challenge-S.
 > 
 > Has an initial port target been designated?
...

      We expect to do the initial port to an Indy with an R4600 or R5000 processor,
following up with other processors and platforms.  A Challenge S is essentially
the same as an Indy without a graphics board.

From owner-linux@cthulhu.engr.sgi.com  Tue Apr 30 17:28:30 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id RAA14940; Tue, 30 Apr 1996 17:28:30 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: by cthulhu.engr.sgi.com (950511.SGI.8.6.12.PATCH526/911001.SGI)
	for linux-list id RAA17798; Tue, 30 Apr 1996 17:28:25 -0700
Received: from info.engr.sgi.com by cthulhu.engr.sgi.com via ESMTP (950511.SGI.8.6.12.PATCH526/911001.SGI)
	for <linux@cthulhu.engr.sgi.com> id RAA17793; Tue, 30 Apr 1996 17:28:24 -0700
Received: from circle-slide.indianapolis.sgi.com by info.engr.sgi.com via ESMTP (950413.SGI.8.6.12/940406.SGI.AUTO)
	for <linux@info.engr.sgi.com> id RAA26759; Tue, 30 Apr 1996 17:28:22 -0700
Received: by circle-slide.indianapolis.sgi.com (950413.SGI.8.6.12/930416.SGI)
	for linux@info id TAA02672; Tue, 30 Apr 1996 19:28:03 -0500
From: jm@circle-slide.indianapolis.sgi.com (jon madison)
Message-Id: <9604301928.ZM2670@circle-slide.indianapolis.sgi.com>
Date: Tue, 30 Apr 1996 19:28:03 -0500
X-Face: wT@!QyzV&.Q}K8PKQ90246#h4)}^Q#u|m5{gyvLyz=XrhvSP3"77M:lY.RQJC*^K]"a]{v5jS/dP8t!$L.Q'\\u|Vx*7wGC`N!kB6iYX@d?}XQ97&OdU@LQKOrKFkGb'H&'I[jq_9Y-CsJqfd?EBS;;Js`b+n^t!UK0)h_aQb[U4,T#/t0!{C[=y]d<W4dj4t"ld]D.VF-;ZH1{)}(kp=O=+ifF.~rbNM&<y7InwkU+6L#
X-Mailer: Z-Mail (3.2.3 08feb96 MediaMail)
To: linux@info.engr.sgi.com
Subject: fvwm on divot
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

installed fvwm 1.24r on divot.engr

j.

-- 
jon madison, silicon graphics, inc. 
us: <URL: http://www.sgi.com/>
mailto:jm@sgi.com        
me: <URL: http://klingon.iupucs.iupui.edu/~jmadison/>

From owner-linux@cthulhu.engr.sgi.com  Tue Apr 30 18:31:32 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id SAA09680; Tue, 30 Apr 1996 18:31:32 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: by cthulhu.engr.sgi.com (950511.SGI.8.6.12.PATCH526/911001.SGI)
	for linux-list id SAA26901; Tue, 30 Apr 1996 18:31:28 -0700
Received: from neteng.engr.sgi.com by cthulhu.engr.sgi.com via ESMTP (950511.SGI.8.6.12.PATCH526/911001.SGI)
	for <linux@cthulhu.engr.sgi.com> id SAA26890; Tue, 30 Apr 1996 18:31:26 -0700
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id SAA09668 for <lmlinux@neteng.engr.sgi.com>; Tue, 30 Apr 1996 18:31:16 -0700
Received: from sgi.sgi.com by cthulhu.engr.sgi.com via ESMTP (950511.SGI.8.6.12.PATCH526/911001.SGI)
	for <lmlinux@neteng.engr.sgi.com> id SAA26820; Tue, 30 Apr 1996 18:31:15 -0700
Received: from caipfs.rutgers.edu by sgi.sgi.com via ESMTP (950405.SGI.8.6.12/910110.SGI)
	for <lmlinux@neteng.engr.sgi.com> id SAA16437; Tue, 30 Apr 1996 18:31:13 -0700
Received: from huahaga.rutgers.edu (huahaga.rutgers.edu [128.6.155.53]) by caipfs.rutgers.edu (8.6.9+bestmx+oldruq+newsunq+grosshack/8.6.9) with ESMTP id VAA01363 for <lmlinux@neteng.engr.sgi.com>; Tue, 30 Apr 1996 21:31:10 -0400
Received: (davem@localhost) by huahaga.rutgers.edu (8.6.9+bestmx+oldruq+newsunq+grosshack/8.6.9) id VAA05228; Tue, 30 Apr 1996 21:31:10 -0400
Date: Tue, 30 Apr 1996 21:31:10 -0400
Message-Id: <199605010131.VAA05228@huahaga.rutgers.edu>
From: "David S. Miller" <davem@caip.rutgers.edu>
To: lmlinux@neteng.engr.sgi.com
Subject: whee
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk


Syscalls are a bit faster, just started optimizing again

                    L M B E N C H  1 . 0   S U M M A R Y
                    ------------------------------------

            Processor, Processes - times in microseconds
            --------------------------------------------
Host                 OS  Mhz    Null    Null  Simple /bin/sh Mmap 2-proc 8-proc
                             Syscall Process Process Process  lat  ctxsw  ctxsw
--------- ------------- ---- ------- ------- ------- ------- ---- ------ ------
trombetas  Linux 1.3.90   50      17    9.3K   38.6K     58K  370     88    109
trombetas  Linux 1.3.97   50      14    8.9K   38.3K     56K  354     86    101
negro.rut SunOS 4.1.3_U   49     124   18.3K   63.9K    110K  470    152    262
geneva.ru     SunOS 5.5   50      31   33.7K  148.2K    274K  596    174    205

            *Local* Communication latencies in microseconds
            -----------------------------------------------
Host                 OS  Pipe       UDP    RPC/     TCP    RPC/
                                            UDP             TCP
--------- ------------- ------- ------- ------- ------- -------
trombetas  Linux 1.3.90     285    1028    1754    1368    2610
trombetas  Linux 1.3.97     300    1016    1752    1376    2598
negro.rut SunOS 4.1.3_U     890    1375    2287    1573    2804
geneva.ru     SunOS 5.5     530    1563    2080    1354    2398

            *Local* Communication bandwidths in megabytes/second
            ----------------------------------------------------
Host                 OS Pipe  TCP  File   Mmap  Bcopy  Bcopy  Mem   Mem
                                  reread reread (libc) (hand) read write
--------- ------------- ---- ---- ------ ------ ------ ------ ---- -----
trombetas  Linux 1.3.90    8  4.0   23.5   17.4     18     25   42    37
trombetas  Linux 1.3.97    8  4.0   23.5   17.4     18     25   41    37
negro.rut SunOS 4.1.3_U    4  2.0   19.5    8.2     18     24   41    36
geneva.ru     SunOS 5.5    8  7.0   12.6   19.5     18     18   40    36

            Memory latencies in nanoseconds
            (WARNING - may not be correct, check graphs)
            --------------------------------------------
Host                 OS   Mhz  L1 $   L2 $    Main mem    TLB    Guesses
--------- -------------   ---  ----   ----    --------    ---    -------
trombetas  Linux 1.3.90    50    20    170         180     -1    No L2 cache?
trombetas  Linux 1.3.97    50    20    170         180    659    No L2 cache?
negro.rut SunOS 4.1.3_U    49    20    175         183     -1    No L2 cache?
geneva.ru     SunOS 5.5    49     -      -           -      -    Bad mhz?
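
For a sense of scale, the null-syscall column of the first table can be
turned into a relative improvement.  This is only arithmetic on the figures
posted above; the helper name percent_faster is made up for the
illustration.

```python
def percent_faster(old_us: float, new_us: float) -> float:
    """Relative reduction in latency, as a percentage of the old value."""
    return (old_us - new_us) / old_us * 100.0


# Null syscall latency in microseconds, taken from the table above.
linux_1_3_90 = 17
linux_1_3_97 = 14

change = percent_faster(linux_1_3_90, linux_1_3_97)
print(f"null syscall latency is {change:.1f}% lower in 1.3.97")
```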

From owner-linux@cthulhu.engr.sgi.com  Tue Apr 30 19:33:40 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id TAA11656; Tue, 30 Apr 1996 19:33:39 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: by cthulhu.engr.sgi.com (950511.SGI.8.6.12.PATCH526/911001.SGI)
	for linux-list id TAA01966; Tue, 30 Apr 1996 19:33:33 -0700
Received: from neteng.engr.sgi.com by cthulhu.engr.sgi.com via ESMTP (950511.SGI.8.6.12.PATCH526/911001.SGI)
	for <linux@cthulhu.engr.sgi.com> id TAA01942; Tue, 30 Apr 1996 19:33:31 -0700
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id TAA11650 for <linux@neteng.engr.sgi.com>; Tue, 30 Apr 1996 19:33:30 -0700
Received: from sgi.sgi.com by cthulhu.engr.sgi.com via ESMTP (950511.SGI.8.6.12.PATCH526/911001.SGI)
	for <linux@neteng.engr.sgi.com> id TAA01926; Tue, 30 Apr 1996 19:33:28 -0700
Received: from informatik.uni-koblenz.de by sgi.sgi.com via ESMTP (950405.SGI.8.6.12/910110.SGI)
	for <linux@neteng.engr.sgi.com> id TAA22209; Tue, 30 Apr 1996 19:33:25 -0700
Received: from grass (grass.uni-koblenz.de [141.26.4.65]) by informatik.uni-koblenz.de (8.7.4/8.6.9) with SMTP id EAA26130; Wed, 1 May 1996 04:33:21 +0200 (MET DST)
From: Ralf Baechle <ralf@informatik.uni-koblenz.de>
Message-Id: <199605010233.EAA26130@informatik.uni-koblenz.de>
Received: by grass (5.x/KO-2.0)
	id AA01190; Wed, 1 May 1996 04:30:58 +0200
Subject: Re: scope of this mailing list
To: ewt@redhat.com (Erik Troan)
Date: Wed, 1 May 1996 04:30:58 +0200 (MET DST)
Cc: linux@neteng.engr.sgi.com
In-Reply-To: <Pine.LNX.3.91.960429200526.3781C-100000@redhat.com> from "Erik Troan" at Apr 29, 96 08:06:49 pm
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

Hi,

> > to get a Linux/MIPs distribution.  Furthermore, given that Linux/MIPs
> > will run IRIX elf binaries, we might be able to merge the Freeware and
> > Linux/MIPs efforts - they have a lot of overlap.  Something to think 
> > about.
> 
> This raises a good question - what is the relationship between the SGI port,
> a port to Digital MIPS/TurboChannel machines, and the MIPS/PC port (that
> works on MIPS machines with PCI/EISA buses)? Will they all be the same
> endian? Should binaries be compatible? What about sources such as libc
> and the kernel syscall interface?

The main issue in achieving binary compatibility across all Linux/MIPS
targets is the byte order.  For some machines (Mips Magnum 4000, Olivetti
M700-10, SNI RM series and others) the byte order for the kernel is
configurable.  For others it is fixed.  This is often the case for machines
that were built with NT in mind.

The MIPS architecture offers us the nice feature of switchable byte order
for user mode.  Thus we have a way to run software from other systems with
a differing native byte order.  In other words: it's technically possible,
but it's not implemented yet.
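
As a rough illustration of why byte order matters for binary compatibility,
the same four bytes decode to different 32-bit values depending on the byte
order assumed.  This is a sketch only and is not taken from any Linux/MIPS
code; the helper name decode_word is invented for the example.

```python
import struct


def decode_word(raw: bytes, big_endian: bool) -> int:
    """Interpret four raw bytes as an unsigned 32-bit word."""
    fmt = ">I" if big_endian else "<I"
    return struct.unpack(fmt, raw)[0]


# The same bytes in memory, read under each byte order.
raw = bytes([0x12, 0x34, 0x56, 0x78])
as_big = decode_word(raw, big_endian=True)
as_little = decode_word(raw, big_endian=False)
print(hex(as_big), hex(as_little))
```

Data written by a program built for one byte order therefore has to be
swapped, or the processor switched, before code built for the other order
can use it.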

Supporting the MIPS ABI is one design goal for Linux/MIPS.  The ABI supports
only big endian systems, while current Linux/MIPS implementations are all
little endian.  This single fact shows that Linux/MIPS doesn't currently
conform to the ABI, but it will be relatively easy to do so in the future.

The ABI explicitly forbids direct syscalls from user code into the
kernel.  Instead every program is supposed to be linked with the shared
library libc.so.1, which contains the actual interface to the kernel.
Linux/MIPS currently uses the GNU libc, which is far from being compliant
with the ABI.

Nevertheless Linux/MIPS contains a (currently only partially implemented)
syscall interface that provides not only the syscalls known from the
Linux/i386 implementation; it also features the same syscall conventions,
numbers and more as implemented in IRIX and other MIPS UNIX systems.
Call it a kludge, but it can make things easier.
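
One way to picture such a combined interface is a dispatcher that selects a
per-system table by syscall-number range.  The sketch below is purely
illustrative: the base values, the table contents and the names (IRIX_BASE,
LINUX_BASE, dispatch) are assumptions made for this example and are not
confirmed by anything in this thread.

```python
# Hypothetical bases: each system personality gets its own number range.
IRIX_BASE = 1000    # assumed for illustration
LINUX_BASE = 4000   # assumed for illustration

# Hypothetical handler tables, indexed by offset from the base.
IRIX_TABLE = {1: lambda: "irix exit", 20: lambda: "irix getpid"}
LINUX_TABLE = {1: lambda: "linux exit", 20: lambda: "linux getpid"}

# Checked from the highest base down, so each number lands in one range.
RANGES = [(LINUX_BASE, LINUX_TABLE), (IRIX_BASE, IRIX_TABLE)]


def dispatch(number: int) -> str:
    """Pick the table by number range, then the handler by offset."""
    for base, table in RANGES:
        if number >= base:
            handler = table.get(number - base)
            if handler is None:
                raise LookupError(f"unimplemented syscall {number}")
            return handler()
    raise LookupError(f"syscall {number} is below every known range")


print(dispatch(LINUX_BASE + 20))
print(dispatch(IRIX_BASE + 20))
```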

   Ralf

From owner-linux@cthulhu.engr.sgi.com  Tue Apr 30 22:36:46 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id WAA13818; Tue, 30 Apr 1996 22:36:45 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: by cthulhu.engr.sgi.com (950511.SGI.8.6.12.PATCH526/911001.SGI)
	for linux-list id WAA13641; Tue, 30 Apr 1996 22:35:11 -0700
Received: from info.engr.sgi.com by cthulhu.engr.sgi.com via ESMTP (950511.SGI.8.6.12.PATCH526/911001.SGI)
	for <linux@cthulhu.engr.sgi.com> id WAA13635; Tue, 30 Apr 1996 22:35:10 -0700
Received: from divot.engr.sgi.com by info.engr.sgi.com via ESMTP (950413.SGI.8.6.12/940406.SGI.AUTO)
	for <linux@info.engr.sgi.com> id WAA18872; Tue, 30 Apr 1996 22:35:01 -0700
Received: by divot.engr.sgi.com (950413.SGI.8.6.12/940406.SGI.AUTO)
	for linux@info id WAA14594; Tue, 30 Apr 1996 22:34:56 -0700
Date: Tue, 30 Apr 1996 22:34:56 -0700
From: root@divot.engr.sgi.com (Super-User)
Message-Id: <199605010534.WAA14594@divot.engr.sgi.com>
To: linux@info.engr.sgi.com
Subject: xfishtank
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

xfishtank on divot.engr:/usr/bin/X11/xfishtank

j.

From owner-linux@cthulhu.engr.sgi.com  Wed May  1 06:55:08 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id GAA20596; Wed, 1 May 1996 06:55:08 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: by cthulhu.engr.sgi.com (950511.SGI.8.6.12.PATCH526/911001.SGI)
	for linux-list id GAA05621; Wed, 1 May 1996 06:54:50 -0700
Received: from sgi.sgi.com by cthulhu.engr.sgi.com via ESMTP (950511.SGI.8.6.12.PATCH526/911001.SGI)
	for <linux@cthulhu.engr.sgi.com> id GAA05615; Wed, 1 May 1996 06:54:48 -0700
Received: from iron.ingenia.com by sgi.sgi.com via ESMTP (950405.SGI.8.6.12/910110.SGI)
	 id GAA05978; Wed, 1 May 1996 06:54:44 -0700
Received: (from shaver@localhost) by iron.ingenia.com (8.6.9/8.6.9) id JAA17982; Wed, 1 May 1996 09:56:05 -0400
From: Mike Shaver <shaver@ingenia.com>
Message-Id: <199605011356.JAA17982@iron.ingenia.com>
Subject: Re: Let's talk platforms...
To: wje@fir.esd.sgi.com (William J. Earl)
Date: Wed, 1 May 1996 09:56:04 -0400 (EDT)
Cc: linux@cthulhu.engr.sgi.com
In-Reply-To: <199604302311.QAA15849@fir.esd.sgi.com> from "William J. Earl" at Apr 30, 96 04:11:05 pm
X-Mailer: ELM [version 2.4 PL25]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length:        765
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

William J. Earl wrote:
>       We expect to do the initial port to an Indy with an R4600 or
> R5000 processor, following up with other processors and platforms.
> A Challenge S is essentially the same as an Indy without a graphics
> board.

Cool.
That's been passed on to the salescritter as a desirable item, because
of the Linux project.

We'll see how it goes.

Mike

-- 
#> Mike Shaver (shaver@ingenia.com)                                        <#
#> Technical specialist, pedant, packetsmith                               <#
#>                                     Ingenia Communications Corporations <#
#>                    Research, Development, Support and Sleep Deprivation <#
#>                         Packets crafted, bugs found, rebellions quelled <#

From owner-linux@cthulhu.engr.sgi.com  Wed May  1 08:45:32 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id IAA27329; Wed, 1 May 1996 08:45:32 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: by cthulhu.engr.sgi.com (950511.SGI.8.6.12.PATCH526/911001.SGI)
	for linux-list id IAA13207; Wed, 1 May 1996 08:45:27 -0700
Received: from neteng.engr.sgi.com by cthulhu.engr.sgi.com via ESMTP (950511.SGI.8.6.12.PATCH526/911001.SGI)
	for <linux@cthulhu.engr.sgi.com> id IAA13199; Wed, 1 May 1996 08:45:26 -0700
Received: from ares.esd.sgi.com (fddi-ares.engr.sgi.com [192.26.80.60]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id IAA27314 for <linux@neteng.engr.sgi.com>; Wed, 1 May 1996 08:45:25 -0700
Received: from fir.esd.sgi.com by ares.esd.sgi.com via ESMTP (951211.SGI.8.6.12.PATCH1042/950213.SGI.AUTOCF)
	 id IAA17255; Wed, 1 May 1996 08:45:24 -0700
Received: by fir.esd.sgi.com (940816.SGI.8.6.9/920502.SGI.AUTO)
	 id IAA07701; Wed, 1 May 1996 08:44:55 -0700
Date: Wed, 1 May 1996 08:44:55 -0700
From: wje@fir.esd.sgi.com (William J. Earl)
Message-Id: <199605011544.IAA07701@fir.esd.sgi.com>
To: Ralf Baechle <ralf@informatik.uni-koblenz.de>
Cc: ewt@redhat.com (Erik Troan), linux@neteng.engr.sgi.com
Subject: Re: scope of this mailing list
In-Reply-To: <199605010233.EAA26130@informatik.uni-koblenz.de>
References: <Pine.LNX.3.91.960429200526.3781C-100000@redhat.com>
	<199605010233.EAA26130@informatik.uni-koblenz.de>
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

Ralf Baechle writes:
...
 > The main issue in achieving binary compatibility across all Linux/MIPS
 > targets is the byte order.  For some machines (Mips Magnum 4000, Olivetti
 > M700-10, SNI RM series and others) the byte order for the kernel is
 > configurable.  For others it is fixed.  This is often the case for machines
 > that were built with NT in mind.
 > 
 > The MIPS architecture offers us the nice feature of switchable byte order
 > for user mode.  Thus we have a way to run software from other systems with
 > a differing native byte order.  In other words: it's technically possible,
 > but it's not implemented yet.
...

     I once worked on this problem on another OS base.  The basic system
calls are easy.  ioctls, especially for streams, were much harder.  Within
the limits of the published ABI, as opposed to the universe of working programs,
the task is not too difficult.

From owner-linux@cthulhu.engr.sgi.com  Thu May  2 20:31:35 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id UAA27055; Thu, 2 May 1996 20:31:35 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: by cthulhu.engr.sgi.com (950511.SGI.8.6.12.PATCH526/911001.SGI)
	for linux-list id UAA05379; Thu, 2 May 1996 20:30:25 -0700
Received: from neteng.engr.sgi.com by cthulhu.engr.sgi.com via ESMTP (950511.SGI.8.6.12.PATCH526/911001.SGI)
	for <linux@cthulhu.engr.sgi.com> id UAA05374; Thu, 2 May 1996 20:30:24 -0700
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id UAA27044 for <lmlinux@neteng.engr.sgi.com>; Thu, 2 May 1996 20:30:08 -0700
Received: from sgi.sgi.com by cthulhu.engr.sgi.com via ESMTP (950511.SGI.8.6.12.PATCH526/911001.SGI)
	for <lmlinux@neteng.engr.sgi.com> id UAA05310; Thu, 2 May 1996 20:30:00 -0700
Received: from caipfs.rutgers.edu by sgi.sgi.com via ESMTP (950405.SGI.8.6.12/910110.SGI)
	for <lmlinux@neteng.engr.sgi.com> id UAA02709; Thu, 2 May 1996 20:29:53 -0700
Received: from huahaga.rutgers.edu (huahaga.rutgers.edu [128.6.155.53]) by caipfs.rutgers.edu (8.6.9+bestmx+oldruq+newsunq+grosshack/8.6.9) with ESMTP id XAA03769 for <lmlinux@neteng.engr.sgi.com>; Thu, 2 May 1996 23:29:45 -0400
Received: (davem@localhost) by huahaga.rutgers.edu (8.6.9+bestmx+oldruq+newsunq+grosshack/8.6.9) id XAA09463; Thu, 2 May 1996 23:29:45 -0400
Date: Thu, 2 May 1996 23:29:45 -0400
Message-Id: <199605030329.XAA09463@huahaga.rutgers.edu>
From: "David S. Miller" <davem@caip.rutgers.edu>
To: lmlinux@neteng.engr.sgi.com
Subject: add a 'make -j vmlinux' for flavor...
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk


SparcClassic, 24mb of ram, swapping ferociously...
Can anyone say "quality assurance"?

USER       PID %CPU %MEM  SIZE   RSS TTY STAT START   TIME COMMAND
davem      202  0.0  0.2   148    64  ?  S    02:45   0:00 ./crashme +2000 666 100 1:10:30 2 
davem      203  0.0  0.2   148    64  ?  S    02:45   0:00 ./crashme +2000 666 100 1:10:30 2 
davem      206  0.0  0.2   148    64  ?  S    02:45   0:00 ./crashme +2000 666 100 1:10:30 2 
davem      210  0.0  0.2   148    64  ?  S    02:45   0:00 ./crashme +2000 666 100 1:10:30 2 
davem      279  3.0  0.2   344    56  ?  R    02:46   1:09 ./crashme +2000 686 100 21 2 subprocess 
davem      281  3.0  0.2   344    56  ?  R    02:46   1:09 ./crashme +2000 686 100 21 2 subprocess 
davem      288  3.0  0.2   344    52  ?  R    02:46   1:08 ./crashme +2000 686 100 21 2 subprocess 
davem      291  3.0  0.2   344    52  ?  R    02:46   1:08 ./crashme +2000 686 100 21 2 subprocess 
davem      428  1.9  0.2   232    56  ?  R    02:49   0:40 ./crashme +2000 721 100 56 2 subprocess 
davem      429  1.9  0.2   228    56  ?  R    02:49   0:40 ./crashme +2000 721 100 56 2 subprocess 
davem      430  1.9  0.2   232    56  ?  R    02:49   0:40 ./crashme +2000 721 100 56 2 subprocess 
davem      431  1.9  0.2   232    56  ?  R    02:49   0:40 ./crashme +2000 721 100 56 2 subprocess 
davem      508  1.6  0.2   256    52  ?  R    02:50   0:32 ./crashme +2000 741 100 76 2 subprocess 
davem      509  1.6  0.2   252    52  ?  R    02:50   0:32 ./crashme +2000 741 100 76 2 subprocess 
davem      510  1.6  0.2   252    52  ?  R    02:50   0:32 ./crashme +2000 741 100 76 2 subprocess 
davem      511  1.6  0.2   256    52  ?  R    02:50   0:32 ./crashme +2000 741 100 76 2 subprocess 
davem      544  1.5  0.2   348    52  ?  R    02:51   0:30 ./crashme +2000 750 100 85 2 subprocess 
davem      545  1.5  0.2   348    52  ?  R    02:51   0:30 ./crashme +2000 750 100 85 2 subprocess 
davem      546  1.5  0.2   348    52  ?  R    02:51   0:30 ./crashme +2000 750 100 85 2 subprocess 
davem      547  1.5  0.2   344    52  ?  R    02:51   0:30 ./crashme +2000 750 100 85 2 subprocess 
davem      580  1.4  0.2   172    52  ?  R    02:52   0:28 ./crashme +2000 759 100 94 2 subprocess 
davem      581  1.4  0.2   176    52  ?  R    02:52   0:28 ./crashme +2000 759 100 94 2 subprocess 
davem      582  1.4  0.2   176    52  ?  R    02:52   0:28 ./crashme +2000 759 100 94 2 subprocess 
davem      583  1.4  0.2   172    52  ?  R    02:52   0:28 ./crashme +2000 759 100 94 2 subprocess 
davem      644  1.3  0.2   256    52  ?  R    02:53   0:24 ./crashme +2000 775 100 110 2 subprocess 
davem      645  1.3  0.2   256    52  ?  R    02:53   0:24 ./crashme +2000 775 100 110 2 subprocess 
davem      646  1.3  0.2   256    52  ?  R    02:53   0:24 ./crashme +2000 775 100 110 2 subprocess 
davem      647  1.3  0.2   256    52  ?  R    02:53   0:24 ./crashme +2000 775 100 110 2 subprocess 
davem      680  1.3  0.2   252    52  ?  R    02:54   0:23 ./crashme +2000 784 100 119 2 subprocess 
davem      681  1.3  0.2   252    52  ?  R    02:54   0:23 ./crashme +2000 784 100 119 2 subprocess 
davem      682  1.3  0.2   252    52  ?  R    02:54   0:23 ./crashme +2000 784 100 119 2 subprocess 
davem      683  1.3  0.2   252    52  ?  R    02:54   0:23 ./crashme +2000 784 100 119 2 subprocess 
davem      684  1.3  0.2   284    52  ?  R    02:54   0:23 ./crashme +2000 785 100 120 2 subprocess 
davem      685  1.3  0.2   284    52  ?  R    02:54   0:23 ./crashme +2000 785 100 120 2 subprocess 
davem      686  1.3  0.2   284    52  ?  R    02:54   0:23 ./crashme +2000 785 100 120 2 subprocess 
davem      687  1.3  0.2   284    52  ?  R    02:54   0:23 ./crashme +2000 785 100 120 2 subprocess 
davem      712  1.3  0.2   192    52  ?  R    02:55   0:22 ./crashme +2000 792 100 127 2 subprocess 
davem      713  1.3  0.2   196    52  ?  R    02:55   0:22 ./crashme +2000 792 100 127 2 subprocess 
davem      714  1.3  0.2   192    52  ?  R    02:55   0:22 ./crashme +2000 792 100 127 2 subprocess 
davem      716  1.3  0.2   192    52  ?  R    02:55   0:22 ./crashme +2000 792 100 127 2 subprocess 
davem      821  1.2  0.2   348    52  ?  R    02:57   0:19 ./crashme +2000 819 100 154 2 subprocess 
davem      822  1.2  0.2   348    52  ?  R    02:57   0:19 ./crashme +2000 819 100 154 2 subprocess 
davem      876  1.2  0.2   220    56  ?  R    02:59   0:18 ./crashme +2000 833 100 168 2 subprocess 
davem      877  1.2  0.2   220    56  ?  R    02:59   0:18 ./crashme +2000 833 100 168 2 subprocess 
davem      878  1.2  0.2   220    56  ?  R    02:59   0:18 ./crashme +2000 833 100 168 2 subprocess 
davem      880  1.2  0.2   216    56  ?  R    02:59   0:18 ./crashme +2000 833 100 168 2 subprocess 
davem     1029  1.1  0.2   308    52  ?  R    03:02   0:15 ./crashme +2000 871 100 206 2 subprocess 
davem     1030  1.1  0.2   316    52  ?  R    03:02   0:15 ./crashme +2000 871 100 206 2 subprocess 
davem     1092  1.1  0.2   256    52  ?  R    03:03   0:13 ./crashme +2000 887 100 222 2 subprocess 
davem     1094  1.1  0.2   256    52  ?  R    03:03   0:13 ./crashme +2000 887 100 222 2 subprocess 
davem     1095  1.1  0.2   256    52  ?  R    03:03   0:13 ./crashme +2000 887 100 222 2 subprocess 
davem     1096  1.1  0.2   256    52  ?  R    03:03   0:13 ./crashme +2000 887 100 222 2 subprocess 
davem     1104  1.1  0.2   348    52  ?  R    03:04   0:13 ./crashme +2000 889 100 224 2 subprocess 
davem     1131  1.1  0.1   344    36  ?  R    03:04   0:13 ./crashme +2000 896 100 231 2 subprocess 
davem     1239  1.1  0.1   228    40  ?  R    03:07   0:11 ./crashme +2000 923 100 258 2 subprocess 
davem     1240  1.1  0.2   224    52  ?  R    03:07   0:11 ./crashme +2000 923 100 258 2 subprocess 
davem     1300  1.1  0.1   208    36  ?  R    03:09   0:10 ./crashme +2000 939 100 274 2 subprocess 
davem     1302  1.1  0.1   204    36  ?  R    03:09   0:10 ./crashme +2000 939 100 274 2 subprocess 
davem     1303  1.1  0.2   208    52  ?  R    03:09   0:10 ./crashme +2000 939 100 274 2 subprocess 
davem     1306  1.1  0.2   204    52  ?  R    03:09   0:10 ./crashme +2000 939 100 274 2 subprocess 
davem     1312  1.1  0.1   348    44  ?  R    03:09   0:10 ./crashme +2000 942 100 277 2 subprocess 
davem     1316  1.1  0.2   348    52  ?  R    03:09   0:10 ./crashme +2000 942 100 277 2 subprocess 
davem     1406  1.1  0.2   240    52  ?  R    03:11   0:08 ./crashme +2000 965 100 300 2 subprocess 
davem     1450  1.1  0.1   344    40  ?  R    03:12   0:07 ./crashme +2000 975 100 310 2 subprocess 
davem     1487  1.1  0.1   348    40  ?  R    03:13   0:07 ./crashme +2000 985 100 320 2 subprocess 
davem     1513  1.1  0.1   228    40  ?  R    03:13   0:06 ./crashme +2000 992 100 327 2 subprocess 
davem     1514  1.1  0.1   232    40  ?  R    03:13   0:06 ./crashme +2000 992 100 327 2 subprocess 
davem     1568  1.1  0.2   348    52  ?  R    03:15   0:06 ./crashme +2000 1006 100 341 2 subprocess 
davem     1571  1.1  0.2   348    52  ?  R    03:15   0:06 ./crashme +2000 1006 100 341 2 subprocess 
davem     1597  1.1  0.2   328    52  ?  R    03:15   0:05 ./crashme +2000 1012 100 347 2 subprocess 
davem     1608  1.1  0.2   332    52  ?  R    03:15   0:05 ./crashme +2000 1015 100 350 2 subprocess 
davem     1649  1.0  0.2  3960    52  ?  R    03:16   0:04 ./crashme +2000 1026 100 361 2 subprocess 
davem     1683  1.1  0.2   204    56  ?  R    03:17   0:04 ./crashme +2000 1035 100 370 2 subprocess 
davem     1685  1.2  0.2   200    52  ?  R    03:17   0:04 ./crashme +2000 1035 100 370 2 subprocess 
davem     1687  1.1  0.2   200    52  ?  R    03:17   0:04 ./crashme +2000 1035 100 370 2 subprocess 
davem     1690  1.1  0.2   204    64  ?  R    03:17   0:04 ./crashme +2000 1035 100 370 2 subprocess 
davem     1816  0.2  1.5   584   344  p0 S    03:20   0:00 -bash 
davem     1834  1.2  0.1   192    44  ?  R    03:20   0:02 ./crashme +2000 1072 100 407 2 subprocess 
davem     1835  1.1  0.2   192    52  ?  R    03:20   0:02 ./crashme +2000 1072 100 407 2 subprocess 
davem     1838  1.3  0.2   192    52  ?  R    03:20   0:02 ./crashme +2000 1072 100 407 2 subprocess 
davem     1840  1.1  0.3   192    68  ?  R    03:20   0:02 ./crashme +2000 1072 100 407 2 subprocess 
davem     1844  1.3  0.3   344    72  ?  R    03:20   0:02 ./crashme +2000 1073 100 408 2 subprocess 
davem     1858  1.2  0.2   256    60  ?  R    03:21   0:02 ./crashme +2000 1077 100 412 2 subprocess 
davem     1917  1.7  0.7   348   176  ?  R    03:22   0:01 ./crashme +2000 1089 100 424 2 subprocess 
davem     1958  1.9  1.0   312   240  p0 R    03:22   0:01 ps -auxwww 
davem     1977 24.1  1.0   344   240  ?  R    03:23   0:01 ./crashme +2000 1103 100 438 2 subprocess 
davem     2023 99.9  1.0   312   228  ?  R    03:24   0:00 ./crashme +2000 1115 100 450 2 subprocess 
davem     2028 99.9  1.0  2248   224  ?  R    03:24   0:00 ./crashme +2000 1117 100 452 2 subprocess 
davem     2039 99.9  0.6   220   140  ?  R    03:25   0:00 ./crashme +2000 1119 100 454 2 subprocess 
davem     2040 99.9  1.0   308   228  ?  R    03:25   0:00 ./crashme +2000 1120 100 455 2 subprocess 
davem     2041 99.9  1.0   320   240  ?  R    03:25   0:00 ./crashme +2000 1120 100 455 2 subprocess 
davem     2042 99.9  1.1   332   248  ?  R    03:25   0:00 ./crashme +2000 1119 100 454 2 subprocess 
davem     2043 99.9  0.7   240   156  ?  R    03:25   0:00 ./crashme +2000 1120 100 455 2 subprocess 
davem     2046 99.9  0.9   296   216  ?  R    03:25   0:00 ./crashme +2000 1120 100 455 2 subprocess 
davem     2049 99.9  1.0   308   224  ?  R    03:25   0:00 ./crashme +2000 1122 100 457 2 subprocess 
davem     2051 99.9  0.9   300   220  ?  R    03:25   0:00 ./crashme +2000 1122 100 457 2 subprocess 
davem     2053 99.9  1.0   308   224  ?  R    03:25   0:00 ./crashme +2000 1123 100 458 2 subprocess 
davem     2055 99.9  1.0   308   232  ?  R    03:25   0:00 ./crashme +2000 1123 100 458 2 subprocess 
davem     2058 99.9  0.9   304   220  ?  R    03:25   0:00 ./crashme +2000 1123 100 458 2 subprocess 
davem     2061 99.9  1.0   304   224  ?  R    03:25   0:00 ./crashme +2000 1125 100 460 2 subprocess 
davem     2063  0.0  0.2   148    60  ?  R    03:25   0:00 ./crashme +2000 666 100 1:10:30 2 
davem     2064  0.0  0.2   148    56  ?  R    03:25   0:00 ./crashme +2000 666 100 1:10:30 2 
davem     2065  0.0  0.2   148    56  ?  R    03:25   0:00 ./crashme +2000 1125 100 460 2 subprocess 
davem     2066 99.9  0.6   220   144  ?  R    03:25   0:00 ./crashme +2000 1126 100 461 2 subprocess 
davem     2067 99.9  0.6   224   144  ?  R    03:25   0:00 ./crashme +2000 1126 100 461 2 subprocess 
davem     2068 99.9  0.6   224   144  ?  R    03:25   0:00 ./crashme +2000 1127 100 462 2 subprocess 
davem     2069 99.9  0.6   224   144  ?  R    03:25   0:00 ./crashme +2000 1127 100 462 2 subprocess 
davem     2070 99.9  0.6   220   140  ?  R    03:25   0:00 ./crashme +2000 1126 100 461 2 subprocess 
davem     2071 99.9  0.6   220   144  ?  R    03:25   0:00 ./crashme +2000 1127 100 462 2 subprocess 
davem     2074 99.9  0.6   216   140  ?  R    03:25   0:00 ./crashme +2000 1127 100 462 2 subprocess 
davem     2086  0.0  0.2   148    64  ?  R    03:26   0:00 ./crashme +2000 666 100 1:10:30 2 
davem     2087  0.0  0.2   148    64  ?  R    03:26   0:00 ./crashme +2000 666 100 1:10:30 2 
davem     2088  0.0  0.2   148    64  ?  R    03:26   0:00 ./crashme +2000 666 100 1:10:30 2 
davem     2089  0.0  0.2   148    64  ?  R    03:26   0:00 ./crashme +2000 666 100 1:10:30 2 
davem     2090  0.0  0.2   148    64  ?  R    03:26   0:00 ./crashme +2000 666 100 1:10:30 2 
root         1  0.0  0.0   188     0  ?  SW   02:43   0:00 (init)
root         2  0.0  0.0     0     0  ?  SW   02:43   0:00 (kflushd)
root         3  0.0  0.0     0     0  ?  SW<  02:43   0:01 (kswapd)
root         4  0.0  0.0     0     0  ?  SW   02:43   0:00 (nfsiod)
root         5  0.0  0.0     0     0  ?  SW   02:43   0:00 (nfsiod)
root         6  0.0  0.0     0     0  ?  SW   02:43   0:00 (nfsiod)
root         7  0.0  0.0     0     0  ?  SW   02:43   0:00 (nfsiod)
root        11  0.0  0.1   116    28  ?  S    02:43   0:00 update (bdfluHOME=/ 
root        27  0.0  0.0   628     4  ?  S    02:44   0:00 (inetd)
root        29  0.0  0.0   636     0  ?  SW   02:44   0:00 (portmap)
root        32  0.0  0.3   272    72  ?  S    02:44   0:00 /usr/sbin/syslogd 
root        34  0.0  0.0   308     0  ?  SW   02:44   0:00 (klogd)
root        47  0.0  0.0   152     0   1 SW   02:44   0:00 (getty)
root        48  0.0  0.0   152     0   2 SW   02:44   0:00 (getty)
root        49  0.0  0.0   152     0   3 SW   02:44   0:00 (getty)
root        50  0.0  0.0   152     0   4 SW   02:44   0:00 (getty)
root        51  0.0  0.0   152     0   5 SW   02:44   0:00 (getty)
root        52  0.0  0.0   152     0   6 SW   02:44   0:00 (getty)
root        84  0.0  0.0   540     0  ?  SWN  02:44   0:00 (punish.sh)
root        99  0.0  0.0   892     0  ?  SWN  02:44   0:01 (make)
root       105  0.0  0.0   660     0  ?  SWN  02:44   0:00 (gcc)
root       107  0.0  0.0   552     0  ?  SWN  02:44   0:00 (bash)
root       109  0.0  0.0   864     0  ?  SWN  02:44   0:00 (make)
root       119  0.0  0.0  1416     0  ?  SWN  02:44   0:01 (cpp)
root       120  0.0  2.8  2776   620  ?  R N  02:44   0:01 /usr/local/gnu/lib/gcc-lib/sparc-sun-sunos4.1.4/2.6.3/cc1 -quiet -dumpbase main.c -O2 -Wall -Wstrict-prototypes -fomit-frame-pointer -fno-strength-reduce -o - 
root       121  0.0  0.0  1128     0  ?  SWN  02:44   0:00 (as)
root       124  0.0  0.0   952     0  ?  SWN  02:44   0:01 (make)
root       137  0.0  0.0   660     0  ?  SWN  02:44   0:00 (gcc)
root       138  0.0  0.0   660     0  ?  SWN  02:44   0:00 (gcc)
root       139  0.0  0.0   660     0  ?  SWN  02:44   0:00 (gcc)
root       140  0.0  0.0  1288     0  ?  SWN  02:44   0:01 (cpp)
root       141  0.0  1.8  3088   408  ?  R N  02:44   0:02 /usr/local/gnu/lib/gcc-lib/sparc-sun-sunos4.1.4/2.6.3/cc1 -quiet -dumpbase sched.c -O2 -Wall -Wstrict-prototypes -fomit-frame-pointer -fno-strength-reduce -fno-omit-frame-pointer -o - 
root       142  0.0  0.0  1128     0  ?  SWN  02:44   0:00 (as)
root       144  0.0  1.9  2612   440  ?  R N  02:44   0:00 /usr/local/gnu/lib/gcc-lib/sparc-sun-sunos4.1.4/2.6.3/cc1 -quiet -dumpbase dma.c -O2 -Wall -Wstrict-prototypes -fomit-frame-pointer -fno-strength-reduce -o - 
root       145  0.0  0.0   660     0  ?  SWN  02:44   0:00 (gcc)
root       146  0.0  0.0  1128     0  ?  SWN  02:44   0:00 (as)
root       147  0.0  0.0   660     0  ?  SWN  02:44   0:00 (gcc)
root       149  0.0  0.0  1216     0  ?  SWN  02:44   0:01 (cpp)
root       150  0.0  0.0   660     0  ?  SWN  02:44   0:00 (gcc)
root       151  0.0  0.4  3188    92  ?  R N  02:44   0:02 (cc1)
root       152  0.0  0.0  1128     0  ?  SWN  02:44   0:00 (as)
root       153  0.0  0.0  1232     0  ?  SWN  02:44   0:00 (cpp)
root       154  0.0  0.9  3216   208  ?  R N  02:44   0:02 /usr/local/gnu/lib/gcc-lib/sparc-sun-sunos4.1.4/2.6.3/cc1 -quiet -dumpbase exec_domain.c -O2 -Wall -Wstrict-prototypes -fomit-frame-pointer -fno-strength-reduce -o - 
root       155  0.0  0.0  1236     0  ?  SWN  02:44   0:00 (cpp)
root       156  0.0  0.1  3212    40  ?  R N  02:44   0:02 (cc1)
root       157  0.0  0.0  1128     0  ?  SWN  02:44   0:00 (as)
root       158  0.0  0.0   660     0  ?  SWN  02:44   0:00 (gcc)
root       159  0.0  0.0  1128     0  ?  SWN  02:44   0:00 (as)
root       160  0.0  0.0  1144     0  ?  SWN  02:44   0:00 (cpp)
root       161  0.0  0.0   660     0  ?  SWN  02:44   0:00 (gcc)
root       162  0.1  0.4  3264   108  ?  R N  02:44   0:02 (cc1)
root       163  0.0  0.0  1128     0  ?  SWN  02:44   0:00 (as)
root       164  0.0  0.0   660     0  ?  SWN  02:44   0:00 (gcc)
root       165  0.0  0.0   660     0  ?  SWN  02:44   0:00 (gcc)
root       166  0.0  0.0  1420     0  ?  SWN  02:44   0:01 (cpp)
root       167  0.0  3.6  3100   796  ?  R N  02:44   0:01 /usr/local/gnu/lib/gcc-lib/sparc-sun-sunos4.1.4/2.6.3/cc1 -quiet -dumpbase sys.c -O2 -Wall -Wstrict-prototypes -fomit-frame-pointer -fno-strength-reduce -o - 
root       168  0.0  0.0  1128     0  ?  SWN  02:44   0:00 (as)
root       169  0.0  0.0   660     0  ?  SWN  02:44   0:00 (gcc)
root       170  0.0  0.0   660     0  ?  SWN  02:44   0:00 (gcc)
root       171  0.0  0.0  1368     0  ?  SWN  02:44   0:01 (cpp)
root       172  0.0  0.4  3088   100  ?  R N  02:44   0:01 (cc1)
root       173  0.0  0.0  1128     0  ?  SWN  02:44   0:00 (as)
root       174  0.0  0.0  1260     0  ?  SWN  02:44   0:00 (cpp)
root       175  0.0  0.0  1196     0  ?  SWN  02:44   0:01 (cpp)
root       176  0.0  0.0   660     0  ?  SWN  02:44   0:00 (gcc)
root       177  0.1  1.7  3260   396  ?  R N  02:44   0:02 /usr/local/gnu/lib/gcc-lib/sparc-sun-sunos4.1.4/2.6.3/cc1 -quiet -dumpbase exit.c -O2 -Wall -Wstrict-prototypes -fomit-frame-pointer -fno-strength-reduce -o - 
root       178  0.0  0.0  1128     0  ?  SWN  02:44   0:00 (as)
root       179  0.0  0.0  1252     0  ?  SWN  02:44   0:00 (cpp)
root       180  0.1  1.3  3264   292  ?  R N  02:44   0:02 /usr/local/gnu/lib/gcc-lib/sparc-sun-sunos4.1.4/2.6.3/cc1 -quiet -dumpbase signal.c -O2 -Wall -Wstrict-prototypes -fomit-frame-pointer -fno-strength-reduce -o - 
root       181  0.0  0.0  1128     0  ?  SWN  02:44   0:00 (as)
root       182  0.1  0.7  3272   172  ?  R N  02:44   0:02 (cc1)
root       183  0.0  0.0  1128     0  ?  SWN  02:44   0:00 (as)
root       184  0.0  0.0   660     0  ?  SWN  02:44   0:00 (gcc)
root       185  0.0  0.0  1236     0  ?  SWN  02:44   0:01 (cpp)
root       186  0.0  0.8  3212   180  ?  R N  02:44   0:02 /usr/local/gnu/lib/gcc-lib/sparc-sun-sunos4.1.4/2.6.3/cc1 -quiet -dumpbase info.c -O2 -Wall -Wstrict-prototypes -fomit-frame-pointer -fno-strength-reduce -o - 
root       187  0.0  0.0  1128     0  ?  SWN  02:44   0:00 (as)
root       188  0.0  0.0   660     0  ?  SWN  02:44   0:00 (gcc)
root       189  0.0  0.0  1316     0  ?  SWN  02:44   0:01 (cpp)
root       190  0.0  0.9  3144   204  ?  R N  02:44   0:02 (cc1)
root       191  0.0  0.0  1128     0  ?  SWN  02:44   0:00 (as)
root       192  0.0  0.0   660     0  ?  SWN  02:44   0:00 (gcc)
root       193  0.0  0.0  1188     0  ?  SWN  02:44   0:00 (cpp)
root       194  0.0  0.0  1104     0  ?  SWN  02:44   0:00 (cpp)
root       195  0.0  3.7  3072   824  ?  R N  02:44   0:01 /usr/local/gnu/lib/gcc-lib/sparc-sun-sunos4.1.4/2.6.3/cc1 -quiet -dumpbase resource.c -O2 -Wall -Wstrict-prototypes -fomit-frame-pointer -fno-strength-reduce -o - 
root       196  0.0  0.0  1128     0  ?  SWN  02:44   0:00 (as)
root       197  0.0  3.3  3104   740  ?  R N  02:44   0:02 /usr/local/gnu/lib/gcc-lib/sparc-sun-sunos4.1.4/2.6.3/cc1 -quiet -dumpbase softirq.c -O2 -Wall -Wstrict-prototypes -fomit-frame-pointer -fno-strength-reduce -o - 
root       198  0.0  0.0  1128     0  ?  SWN  02:44   0:00 (as)
root       199  0.0  0.0  1356     0  ?  SWN  02:44   0:01 (cpp)
root       200  0.0  2.5  2512   568  ?  R N  02:44   0:00 /usr/local/gnu/lib/gcc-lib/sparc-sun-sunos4.1.4/2.6.3/cc1 -quiet -dumpbase sysctl.c -O2 -Wall -Wstrict-prototypes -fomit-frame-pointer -fno-strength-reduce -o - 
root       201  0.0  0.0  1128     0  ?  SWN  02:44   0:00 (as)
root      1811  0.1  1.0   668   236  ?  S    03:20   0:00 /usr/etc/in.telnetd 

From owner-linux@cthulhu.engr.sgi.com  Fri May 10 04:21:09 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id EAA20001; Fri, 10 May 1996 04:21:09 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id LAA12246 for linux-list; Fri, 10 May 1996 11:19:43 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id EAA12234 for <linux@cthulhu.engr.sgi.com>; Fri, 10 May 1996 04:19:40 -0700
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id EAA19918 for <lmlinux@neteng.engr.sgi.com>; Fri, 10 May 1996 04:19:39 -0700
Received: from sgi.sgi.com (sgi.engr.sgi.com [150.166.76.30]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id EAA12225 for <lmlinux@neteng.engr.sgi.com>; Fri, 10 May 1996 04:19:38 -0700
Received: from caipfs.rutgers.edu by sgi.sgi.com via ESMTP (950405.SGI.8.6.12/910110.SGI)
	for <lmlinux@neteng.engr.sgi.com> id EAA23217; Fri, 10 May 1996 04:19:37 -0700
Received: from huahaga.rutgers.edu (huahaga.rutgers.edu [128.6.155.53]) by caipfs.rutgers.edu (8.6.9+bestmx+oldruq+newsunq+grosshack/8.6.9) with ESMTP id HAA08456; Fri, 10 May 1996 07:19:33 -0400
Received: (davem@localhost) by huahaga.rutgers.edu (8.6.9+bestmx+oldruq+newsunq+grosshack/8.6.9) id HAA00626; Fri, 10 May 1996 07:19:33 -0400
Date: Fri, 10 May 1996 07:19:33 -0400
Message-Id: <199605101119.HAA00626@huahaga.rutgers.edu>
From: "David S. Miller" <davem@caip.rutgers.edu>
To: lnz@dandelion.com
CC: sparclinux@vger.rutgers.edu, lmlinux@neteng.engr.sgi.com
Subject: check this out...
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk


(Leonard, I thought you'd get a kick out of this...)

I stress tested my new Sparc SCSI driver with the following bus
configuration:

	ID 3: SCSI2 Seagate disk
	ID 4: pre-CCS SCSI1 disk of unknown vendor type ;-)
	ID 5: FAST SCSI2 Conner drive

I ran three instances of a program which just did random seeks
forever, one on each drive. (thank god for lmbench programs which have
a dual role as stress/stability testers ;-)

I was getting bus lockups now and then, and happily my abort/reset
code completely recovered the bus back into a working state and things
proceeded.  After about 30 minutes of analyzing state dumps from my
driver when this would happen, I figured out what was going on.  My
driver now fixes this problem and does not reset anymore no matter how
hard I smash all the drives on this chain.

Anyways, to get to the point, for your edification Leonard, here is
the comment above my fix for the problem. Enjoy ;-)

		/* A fix for broken SCSI1 targets, when they disconnect
		 * they lock up the bus and confuse ESP.  So disallow
		 * disconnects for SCSI1 targets for now until we
		 * find a better fix.
		 *
		 * Addendum: This is funny, I figured out what was going
		 *           on.  The blotzed SCSI1 target would disconnect,
		 *           one of the other SCSI2 targets or both would be
		 *           disconnected as well.  The SCSI1 target would
		 *           stay disconnected long enough that we start
		 *           up a command on one of the SCSI2 targets.  As
		 *           the ESP is arbitrating for the bus the SCSI1
		 *           target begins to arbitrate as well to reselect
		 *           the ESP.  The SCSI1 target refuses to drop its
		 *           ID bit on the data bus even though the ESP is
		 *           at ID 7 and is the obvious winner for any
		 *           arbitration.  The ESP is a poor sport and refuses
		 *           to lose arbitration, it will continue indefinitely
		 *           trying to arbitrate for the bus and can only be
		 *           stopped via a chip reset or SCSI bus reset.
		 *           Therefore _no_ disconnects for SCSI1 targets
		 *           thank you very much. ;-)
		 */
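
A minimal sketch in C of the workaround the comment describes, not the
driver's actual code.  It assumes the driver grants or withholds
disconnect permission through bit 6 (DiscPriv) of the SCSI IDENTIFY
message; struct target_info, scsi_level and build_identify() are
hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-target record; the real driver's structures are
 * not shown in this mail. */
struct target_info {
    int scsi_level;                 /* 1 = SCSI1 (incl. pre-CCS), 2 = SCSI2 */
};

#define IDENTIFY_BASE      0x80u    /* bit 7: marks an IDENTIFY message */
#define IDENTIFY_DISC_PRIV 0x40u    /* bit 6: target is allowed to disconnect */

/* Build the IDENTIFY byte sent to the target after selection.
 * SCSI1 targets never get disconnect privilege, so they cannot
 * try to reselect while the host adapter is arbitrating. */
uint8_t build_identify(const struct target_info *t, unsigned lun)
{
    assert(lun < 8);                /* LUN lives in bits 2..0 */
    uint8_t msg = (uint8_t)(IDENTIFY_BASE | lun);
    if (t->scsi_level >= 2)
        msg |= IDENTIFY_DISC_PRIV;
    return msg;
}
```

With this, a SCSI1 target holds the bus for the whole command, which
costs some throughput on that target but avoids the stuck arbitration.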

Later,
David S. Miller
davem@caip.rutgers.edu

From owner-linux@cthulhu.engr.sgi.com  Mon May 13 07:28:58 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id HAA24667; Mon, 13 May 1996 07:28:57 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id OAA18053 for linux-list; Mon, 13 May 1996 14:27:32 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id HAA18044 for <linux@cthulhu.engr.sgi.com>; Mon, 13 May 1996 07:27:30 -0700
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id HAA24594 for <lmlinux@neteng.engr.sgi.com>; Mon, 13 May 1996 07:27:29 -0700
Received: from sgi.sgi.com (sgi.engr.sgi.com [150.166.76.30]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id HAA18037 for <lmlinux@neteng.engr.sgi.com>; Mon, 13 May 1996 07:27:28 -0700
Received: from caipfs.rutgers.edu by sgi.sgi.com via ESMTP (950405.SGI.8.6.12/910110.SGI)
	for <lmlinux@neteng.engr.sgi.com> id HAA02189; Mon, 13 May 1996 07:27:26 -0700
Received: from huahaga.rutgers.edu (huahaga.rutgers.edu [128.6.155.53]) by caipfs.rutgers.edu (8.6.9+bestmx+oldruq+newsunq+grosshack/8.6.9) with ESMTP id AAA28631; Mon, 13 May 1996 00:05:06 -0400
Received: (davem@localhost) by huahaga.rutgers.edu (8.6.9+bestmx+oldruq+newsunq+grosshack/8.6.9) id AAA05898; Mon, 13 May 1996 00:05:04 -0400
Date: Mon, 13 May 1996 00:05:04 -0400
Message-Id: <199605130405.AAA05898@huahaga.rutgers.edu>
From: "David S. Miller" <davem@caip.rutgers.edu>
To: tridge@cs.anu.edu.au
CC: lmlinux@neteng.engr.sgi.com, torvalds@cs.helsinki.fi
Subject: wicked checksum optimization...
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk


I think I figured out how to do it all "the right way(tm)"

The big problem is alignment, but 97% of the time the buffer is
aligned how we like.  I've decided it is ok to take the hit of an
unaligned access trap for the other 3% of cases, but not that much of a hit.
The implementation looks like this:

All loads and stores in the ip checksum routines will look the same,
the only time we do stores is for the csum/copy routines.  Anyways the
eight instruction codes recognized will be for:

	ld	[%o0 + offset], %o4
	ld	[%o0 + offset], %o5
	lduh	[%o0 + offset], %o4
	lduh	[%o0 + offset], %o5
	st	%o4, [%g3 + offset]
	st	%o5, [%g3 + offset]
	sth	%o4, [%g3 + offset]
	sth	%o5, [%g3 + offset]

The unaligned trap handler (before it even tries to save any state)
will look something like:

mna_trap:
	andcc	%l0, PSR_PS, %g0
	be,a	mna_fromuser

	ld	[%l1], %l5
	sethi	%hi(LOAD_O4), %l4
	and	%l5, %l4, %l6
	cmp	%l6, %l4
	bne	1f
	 sethi	%hi(LOAD_O5), %l4
	mov	%l1, %g6				! %pc
	sethi	%hi(C_LABEL(csum_ldo4_fixup)), %l1
	or	%l1, %lo(C_LABEL(csum_ldo4_fixup)), %l1
	wr	%l0, 0x0, %psr				! fix cond-codes
	and	%l5, LOAD_IMMEDIATE_FIELD, %g7
	srl	%g7, LOAD_IMMEDIATE_SHIFT, %g7		! offset
	jmp	%l1
	rett	%l1 + 0x4

1:
	/* etc. for other instructions recognized */

mna_fromuser:
	SAVE_ALL
	/* From user mode or something we don't handle for the
	 * kernel.
	 */
	call	C_LABEL(do_mna)
	 nop
	RESTORE_ALL
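
The matching step of the trap handler can be sketched in C as a decode
of the trapping instruction word.  This is an illustration under
assumptions, not the handler itself: the field layout is the standard
SPARC V8 format-3 encoding, while decode_csum_insn(), struct
mna_decode and the FIXUP_* names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* SPARC V8 op3 values for the four access types of interest. */
#define OP3_LD   0x00u
#define OP3_LDUH 0x02u
#define OP3_ST   0x04u
#define OP3_STH  0x06u

/* Register numbers: %g3 = r3, %o0 = r8, %o4 = r12, %o5 = r13. */
#define REG_G3  3u
#define REG_O0  8u
#define REG_O4 12u
#define REG_O5 13u

enum csum_fixup { FIXUP_NONE, FIXUP_LD, FIXUP_LDUH, FIXUP_ST, FIXUP_STH };

struct mna_decode {
    enum csum_fixup kind;   /* which fixup routine to run, if any */
    unsigned data_reg;      /* rd: %o4 or %o5 */
    int32_t offset;         /* sign-extended 13-bit immediate */
};

struct mna_decode decode_csum_insn(uint32_t insn)
{
    struct mna_decode d = { FIXUP_NONE, 0, 0 };
    unsigned op  = insn >> 30;
    unsigned rd  = (insn >> 25) & 0x1f;
    unsigned op3 = (insn >> 19) & 0x3f;
    unsigned rs1 = (insn >> 14) & 0x1f;
    unsigned imm = (insn >> 13) & 0x1;

    /* Only memory ops with an immediate offset and %o4/%o5 as data. */
    if (op != 3 || !imm || (rd != REG_O4 && rd != REG_O5))
        return d;

    if ((op3 == OP3_LD || op3 == OP3_LDUH) && rs1 == REG_O0)
        d.kind = (op3 == OP3_LD) ? FIXUP_LD : FIXUP_LDUH;
    else if ((op3 == OP3_ST || op3 == OP3_STH) && rs1 == REG_G3)
        d.kind = (op3 == OP3_ST) ? FIXUP_ST : FIXUP_STH;
    else
        return d;

    d.data_reg = rd;
    d.offset = (int32_t)(insn & 0x1fff);
    if (d.offset & 0x1000)          /* sign bit of simm13 */
        d.offset -= 0x2000;
    assert(d.offset >= -4096 && d.offset <= 4095);
    return d;
}
```

For example, the word 0xD8022010 encodes "ld [%o0 + 0x10], %o4" and
decodes to FIXUP_LD with an offset of 16; anything outside the eight
recognized forms comes back as FIXUP_NONE and would take the slow path.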

Ok, now the fixup routines just look like:

csum_ldo4_fixup:
	ldub	[%o0 + %g7], %g4
	add	%g7, 1, %g7
	ldub	[%o0 + %g7], %g5
	sll	%g4, 24, %g4
	add	%g7, 1, %g7
	sll	%g5, 16, %g5
	or	%g4, %g5, %o4
	ldub	[%o0 + %g7], %g4
	add	%g7, 1, %g7
	ldub	[%o0 + %g7], %g5
	sll	%g4, 8, %g4
	or	%g5, %g4, %g4
	jmp	%g6 + 0x4	! wheee... (resume after the trapping load)
	or	%o4, %g4, %o4

and so on...  then csum_partial and friends can just blaze through
assuming proper alignment for all pointers to the packet contents
etc.  Nifty eh?  Sparc is fun...
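
For readers who do not speak Sparc assembly, here is a C sketch of what
a fixup such as csum_ldo4_fixup computes: a big-endian word assembled
from byte loads, so the address may have any alignment.  The function
names are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Emulate an unaligned 32-bit big-endian load with four byte loads,
 * mirroring the ldub/sll/or sequence in the assembly above. */
uint32_t load_be32_unaligned(const uint8_t *base, size_t offset)
{
    assert(base != NULL);
    const uint8_t *p = base + offset;
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Halfword variant, as an lduh fixup would need. */
uint16_t load_be16_unaligned(const uint8_t *base, size_t offset)
{
    assert(base != NULL);
    const uint8_t *p = base + offset;
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}
```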

Later,
David S. Miller
davem@caip.rutgers.edu

From owner-linux@cthulhu.engr.sgi.com  Mon May 13 18:24:17 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id SAA02898; Mon, 13 May 1996 18:24:17 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id BAA11019 for linux-list; Tue, 14 May 1996 01:22:53 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id SAA11014 for <linux@cthulhu.engr.sgi.com>; Mon, 13 May 1996 18:22:52 -0700
Received: from localhost (lm@localhost) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via SMTP id SAA02367 for <lmlinux>; Mon, 13 May 1996 18:22:51 -0700
Message-Id: <199605140122.SAA02367@neteng.engr.sgi.com>
To: lmlinux@neteng.engr.sgi.com
Subject: Uselinux and the Linux/SPARC port (forwarded)
Date: Mon, 13 May 1996 18:22:51 -0700
From: Larry McVoy <lm@neteng.engr.sgi.com>
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk


------- Forwarded Message

Date:    Sat, 11 May 1996 10:52:53 -0500
From:    Miguel de Icaza <miguel@roxanne.nuclecu.unam.mx>
To:      sparclinux-cvs@caipfs.rutgers.edu
cc:      davem@caip.rutgers.edu, tridge@cs.anu.edu.au
Subject: Uselinux and the Linux/SPARC port


Hello guys, 

   In January the Uselinux conference will be held together with the
Usenix technical conference (http://www.usenix.org).  David and I had
plans to make a technical presentation on the Linux/SPARC port.  I
think Linux/SPARC is not just another port, but a port that has some
unique features not found on other ports.

   The idea is to co-author this paper with those of you wanting to
help/contribute to the article.

   I have assembled a list of things I think are interesting on
Linux/SPARC.  I guess that before we can make a presentation on why we
find Linux/SPARC to be a unique port we will have to massage this
quite a bit.  I have already mailed this one to Michael Johnson (he is
the technical session chair, I think).  If you want to co-author this
with David and me, please contact me (and CC the sparclinux-cvs list;
David will be very busy before leaving for SGI, so I'm doing the
coordination for this).

Best wishes,
Miguel.

* MMU/cache

   The Sparc has 3 types of MMU -- this is easy to plug into Linux.
   
   The Sparc has Virtually Addressed Caches (VAC) and physical caches --
   these require changes to the Linux kernel.  With the VACs, one must
   be careful not to flush the caches completely, as was being done
   initially, because that damages performance quite a bit.
   
   As an extra challenge, some MMUs are buggy, and the MMUs are
   accessed in different ways.

   Linux/SPARC uses function pointers for most MMU/cache manipulation.
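
   The function pointer approach can be sketched in C as follows.
   This is only an illustration: struct mmu_ops, mmu_select() and the
   other names are hypothetical, and the two variants are reduced to
   counters so the dispatch can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operations table; the names in the real port may differ. */
struct mmu_ops {
    const char *name;
    void (*flush_cache_all)(void);
    void (*flush_cache_page)(unsigned long vaddr);
};

/* Demo counters standing in for real cache flushes. */
unsigned long full_flushes, page_flushes;

static void vac_flush_all(void) { full_flushes++; }
static void vac_flush_page(unsigned long vaddr) { (void)vaddr; page_flushes++; }

/* Simplification for this sketch: a physical cache needs no flush
 * when a mapping changes. */
static void pac_flush_all(void) { }
static void pac_flush_page(unsigned long vaddr) { (void)vaddr; }

static const struct mmu_ops vac_ops = { "virtual cache", vac_flush_all, vac_flush_page };
static const struct mmu_ops pac_ops = { "physical cache", pac_flush_all, pac_flush_page };

static const struct mmu_ops *mmu;   /* chosen once at boot */

void mmu_select(int has_virtual_cache)
{
    mmu = has_virtual_cache ? &vac_ops : &pac_ops;
}

/* Generic code calls through the table and never tests the MMU type.
 * It flushes a single page rather than the whole cache. */
void unmap_one_page(unsigned long vaddr)
{
    assert(mmu != NULL);
    mmu->flush_cache_page(vaddr);
}
```

   The generic kernel code stays free of per-machine conditionals, and
   a new MMU type only has to supply another table.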

* SunOS compatibility.

   We use the model that Linus used to get OSF/1 compatibility.
   (we use same constants, and same structures for the kernel, 
   thus no need to translate anything).

   This model lets us have a userland ready early during the port, and
   emulation code is written only when something fails to run properly.
   
   Compare NetBSD, with 4114 lines of Sparc-specific code plus 2753
   lines of generic emulation "libraries", to Linux, with 1140
   lines.  We think this is the right way for future ports.  

   Linux emulation code mostly covers features not usually found in
   Linux (such as assumptions SunOS applications make about the OS),
   while NetBSD's emulation code is spent converting to and from each
   OS's constants and data structures, so Linux does not have a
   runtime performance penalty for running SunOS applications.
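
   The difference between the two models can be sketched in C using
   open(2) flags.  The GUEST_* values follow the BSD/SunOS style and
   the HOST_* values follow Linux/i386; treat both sets as
   illustrative, and the function names as hypothetical.

```c
#include <assert.h>

/* Flag bits as the emulated (SunOS-style) binary passes them. */
#define GUEST_O_CREAT 0x0200
#define GUEST_O_TRUNC 0x0400
#define GUEST_O_EXCL  0x0800

/* A host kernel that chose different numbers for the same flags. */
#define HOST_O_CREAT  0x0040
#define HOST_O_EXCL   0x0080
#define HOST_O_TRUNC  0x0200

#define ACCMODE_MASK  0x0003    /* read/write mode bits, same on both */

/* Translation model: every call pays for a bit-by-bit conversion. */
int translate_open_flags(int guest)
{
    /* The sketch only knows these flags. */
    assert((guest & ~(ACCMODE_MASK | GUEST_O_CREAT |
                      GUEST_O_TRUNC | GUEST_O_EXCL)) == 0);
    int host = guest & ACCMODE_MASK;
    if (guest & GUEST_O_CREAT) host |= HOST_O_CREAT;
    if (guest & GUEST_O_TRUNC) host |= HOST_O_TRUNC;
    if (guest & GUEST_O_EXCL)  host |= HOST_O_EXCL;
    return host;
}

/* Shared-constants model: the kernel adopted the guest's numbers,
 * so the value is used as-is. */
int native_open_flags(int guest)
{
    return guest;
}
```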

   Sparc drivers have to support both the Linux interface and the
   SunOS interface to make some programs happy (X-Window): the mouse
   driver, keyboard driver and frame buffer drivers provide SunOS
   interfaces.  The mouse and keyboard drivers also provide the same
   interface to Linux applications as is found on the original i386
   port.

* Drivers for undocumented hardware.

   Getting information out of Sun is not very easy.  Documentation for
   some hardware came from the vendors that make the chips (scsi and
   lance drivers); sources for the NetBSD, Sprite and Xinu operating
   systems were used as references at the beginning of the port to
   understand the architecture.  

   bwtwo, cg3 and cg6 drivers were based on existing NetBSD drivers
   and a generic frame buffer interface for Linux/SPARC was coded.  A
   driver for the cg14 has been written using only the Solaris include
   file for the video card and some poking at the ROM, which lets us
   emulate a low-end graphics card on it.

* Stability and performance.

   Throughout the port, crashme has been used constantly to avoid
   the same mistakes found in the Sun operating systems.  

   lmbench has been used to test the performance of the operating
   system.  It was not just used as a benchmarking tool, but also to
   pinpoint weaknesses in the port.  After a couple of weeks tuning
   the port, Linux/SPARC was able to keep up with SunOS and Solaris
   on the same hardware, in some cases outperforming them.

   The lmbench results are more interesting than they may appear at
   first sight: they do not only reflect that Linux is a great
   operating system but, sadly, also the fact that corporate
   operating systems are sometimes bloated and slow.  The bloat in
   operating systems may come from different sources.  As documented
   in the 4.3BSD book, there were 13 memory allocators in the BSD
   kernel in 1986, and at that point they were aware of the problem:
   code is being rewritten over and over.  This in my opinion means
   that most kernel programmers lack a deep knowledge of the
   operating system and may be writing things that are not
   completely clean.

   (anecdote: over the past two weeks I made one remark and one
   suggestion about the kernel to Linus, and even though they looked
   fine to me and many people agreed with them, Linus quickly pointed
   out to me where the design flaws were; it was related to rfork,
   btw).

* SMP

   Problems encountered when porting Linux to the Sparc.  Not all SMP
   machines are born the same, there are some hacks required to get
   SMP working on the Sparc.  

   [Unfinished section; I will complete this section later]

* Portability and the AP+

   The Linux/SPARC port is itself relatively easy to port to non-Sun
   hardware.  Andrew Tridgell and his team of hackers in Canberra have
   ported Linux/SPARC to the AP+ multiprocessor.

   [ This section is still  not finished ]

* Bootstrapping the SPARC.

   Booting the SPARC required: a) cleaning up the ext2fs to make it
   aware of the endianness of the SPARC;  b) merging nfsroot into the
   kernel, which allowed one of the developers to work without
   a hard disk and let people test drive the Linux kernel without
   needing to reformat their disks.
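
   The endianness point can be sketched in C.  ext2 stores its on-disk
   fields little-endian, while the SPARC is big-endian, so fields have
   to be read through helpers that fix the byte order.  The helper
   names below are hypothetical; the magic number 0xEF53 at byte
   offset 56 of the superblock is the standard ext2 value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXT2_MAGIC_OFFSET 56        /* offset of s_magic in the superblock */
#define EXT2_SUPER_MAGIC  0xEF53

/* Read little-endian on-disk fields byte by byte, so the result is
 * the same on little-endian and big-endian CPUs. */
uint16_t get_le16(const uint8_t *p)
{
    return (uint16_t)((unsigned)p[0] | ((unsigned)p[1] << 8));
}

uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns nonzero if the raw superblock bytes carry the ext2 magic. */
int ext2_magic_ok(const uint8_t *sb)
{
    assert(sb != NULL);
    return get_le16(sb + EXT2_MAGIC_OFFSET) == EXT2_SUPER_MAGIC;
}
```

   Code that instead casts the buffer to a structure and reads s_magic
   directly works on the i386 but sees the bytes swapped on the SPARC.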

   Jay Eastabrook's Alpha console code split was used as the second
   attempt at the Linux console code.  Once we had this one, we got
   the same functionality of Linux/i386 on the sparc.

* SILO.
 
   SILO is unique in one aspect:  It is the first Linux boot loader
   that uses the ext2fs library from the ext2fs tool suite by Remy
   Card and Ted Tso.  This is a good thing because it let us write the
   boot loader in a very short time frame.  

   The SILO bootloader fully understands the ext2fs and thus does not
   require a special boot loader installer for each kernel image that
   is made available.  There is still work in progress on LILO to make
   it work with iso9660 cdroms and boot loading.

* The port

   David Miller set up an account for the developers on vger to have
   access to his CVS tree so that we could make changes directly to
   the source tree without taking time from him.  Simple rules were
   set: test before you commit your code.

   Another idea that was pushed since the beginning was to keep track
   of current kernel development.  Even if this would imply that we
   had to spend some time merging and fixing sparc stuff on our
   kernels, it helped us to merge our tree faster and easier than the
   other ports.  Letting versions slip proved to be not a very good
   idea, and the MkLinux people will suffer with it as did the m68k
   guys some months ago. 

* Userland

   The first attempt at a Libc port was done by porting Linux's a.out
   Libc4 to the SPARC.  Later a GNU libc port was attempted, but
   because the kernel adapts to the host OS for binary
   compatibility, the GNU libc port was found not to be possible at
   the time.  We may try again as soon as some issues are hashed out
   on glibc.  

   Currently the Linux Libc 5 is in use and ELF loading has been fixed
   into the kernel.  Eric Youngdale is making the elf loader less
   i386-centric, and thus our patches will go in easier.

   The libc5 supports shared executables and Eric's ld.so linker has
   been ported to the SPARC, as well as adapting it to Linux
   (originally it was written and tested on Solaris).  What is amazing
   is that this linker is quite portable nowadays.

   RedHat and Debian are both working on Linux distributions for the
   SPARC.  The goal is to have the same interface on all the different
   architectures and, in the future, to encourage commercial software
   companies to use a cross compiler to build versions of their
   software for all the available Linux platforms.

   

------- End of Forwarded Message


From owner-linux@cthulhu.engr.sgi.com  Mon May 13 18:24:32 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id SAA03012; Mon, 13 May 1996 18:24:31 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id BAA11067 for linux-list; Tue, 14 May 1996 01:23:08 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id SAA11062 for <linux@cthulhu.engr.sgi.com>; Mon, 13 May 1996 18:23:07 -0700
Received: from localhost (lm@localhost) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via SMTP id SAA02399 for <lmlinux>; Mon, 13 May 1996 18:23:06 -0700
Message-Id: <199605140123.SAA02399@neteng.engr.sgi.com>
To: lmlinux@neteng.engr.sgi.com
Subject: Linux/AP+ technical report (forwarded)
Date: Mon, 13 May 1996 18:23:06 -0700
From: Larry McVoy <lm@neteng.engr.sgi.com>
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk


------- Forwarded Message

Date:    Sat, 11 May 1996 22:51:12 +1000
From:    Andrew Tridgell <tridge@cs.anu.edu.au>
To:      Multiple recipients of list <linux-mc@arvidsjaur.anu.edu.au>
Subject: Linux/AP+ technical report

Hi,

We've written a short tech report on what we've done so far to port
Linux to the AP1000+ multicomputer.

It's available on our home page at
http://cap.anu.edu.au/cap/projects/linux

We'd be very interested to hear comments on what you think of our
approach. 

Obviously our port is still pretty simple, and a lot more needs to be
done to make a real multicomputer OS, but at least it's a start :-)

Andrew

PS: I'm also hoping this might start some discussion on this list!

------- End of Forwarded Message


From owner-linux@cthulhu.engr.sgi.com  Tue May 14 20:30:16 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id UAA09081; Tue, 14 May 1996 20:30:16 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id DAA07340 for linux-list; Wed, 15 May 1996 03:28:51 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id UAA07333 for <linux@cthulhu.engr.sgi.com>; Tue, 14 May 1996 20:28:49 -0700
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id UAA09037 for <lmlinux@neteng.engr.sgi.com>; Tue, 14 May 1996 20:28:48 -0700
Received: from sgi.sgi.com (sgi.engr.sgi.com [150.166.76.30]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id UAA07320; Tue, 14 May 1996 20:28:48 -0700
Received: from caipfs.rutgers.edu by sgi.sgi.com via ESMTP (950405.SGI.8.6.12/910110.SGI)
	 id UAA08560; Tue, 14 May 1996 20:28:45 -0700
Received: from huahaga.rutgers.edu (huahaga.rutgers.edu [128.6.155.53]) by caipfs.rutgers.edu (8.6.9+bestmx+oldruq+newsunq+grosshack/8.6.9) with ESMTP id XAA00765; Tue, 14 May 1996 23:28:39 -0400
Received: (davem@localhost) by huahaga.rutgers.edu (8.6.9+bestmx+oldruq+newsunq+grosshack/8.6.9) id XAA24278; Tue, 14 May 1996 23:28:38 -0400
Date: Tue, 14 May 1996 23:28:38 -0400
Message-Id: <199605150328.XAA24278@huahaga.rutgers.edu>
From: "David S. Miller" <davem@caip.rutgers.edu>
To: lm@gate1-neteng.engr.sgi.com
CC: lmlinux@neteng.engr.sgi.com, torvalds@cs.helsinki.fi,
        sparclinux-cvs@caipfs.rutgers.edu
Subject: numbers...
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk


Two nights of coding, not bad...

This has to be the fastest csum and csum_copy humanly codable on the
Sparc.  Here are some initial results.  The cache was purposely
forced to be cold on each iteration:

sun4m SS10 115mhz hypersparc, 256k cache
measure_csum_partial
csum_partial: sz[1024] 10000 iterations takes 17009430 microseconds
csum_partial: sz[1024] 1 iteration takes <1700 microseconds>==<332 nanoseconds>
csum_partial: sz[2048] 10000 iterations takes 32609649 microseconds
csum_partial: sz[2048] 1 iteration takes <3260 microseconds>==<636 nanoseconds>
csum_partial: sz[3072] 10000 iterations takes 48152422 microseconds
csum_partial: sz[3072] 1 iteration takes <4815 microseconds>==<940 nanoseconds>
csum_partial: sz[4096] 10000 iterations takes 63708679 microseconds
csum_partial: sz[4096] 1 iteration takes <6370 microseconds>==<1244 nanoseconds>

measure_csum_partial_copy
csum_partial_copy: sz[1024] 10000 iterations takes 30865212 microseconds
csum_partial_copy: sz[1024] 1 iteration takes <3086 microseconds>==<602 nanoseconds>

I'm estimating the function call overhead is around 29 nanoseconds to
get into the damn routine, and around 3 or 4 nanoseconds overhead from
the loop constructs etc. in the benchmark.  It seems to scale very
nicely, on this chip you tend to eat around 8ns for byte end aligned
buffers and around 5ns for an extraneous halfword at the end of the
buffer.  0.4us/Kbyte for csum, and 0.6us/Kbyte for csum_copy, _really_
impressive.
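
For anyone reading without the Sparc assembly in front of them, the
quantity being computed is the usual 16-bit one's complement Internet
checksum.  A minimal portable C sketch of it follows; csum_ref() and
csum_fold() are hypothetical names used only for illustration, not the
tuned routines measured above, and the big-endian word order is an
assumption that happens to match the Sparc:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical reference routine, NOT the tuned Sparc code:
 * accumulate 16-bit big-endian words into a 32-bit partial sum.
 * The 32-bit accumulator is only safe for buffers well under
 * 128 Kbytes; a real routine would use add-with-carry instead.
 */
static uint32_t csum_ref(const unsigned char *buf, size_t len, uint32_t sum)
{
	while (len >= 2) {
		sum += ((uint32_t)buf[0] << 8) | buf[1];
		buf += 2;
		len -= 2;
	}
	if (len)		/* odd trailing byte is padded with a zero byte */
		sum += (uint32_t)buf[0] << 8;
	return sum;
}

/* Fold the 32-bit accumulator down to 16 bits and complement it. */
static uint16_t csum_fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}
```

The extra work for an odd trailing byte or halfword is the kind of
tail handling the 8ns and 5ns figures above refer to.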

Here are some figures for small buffers on the same cpu:

csum_partial: sz[20] 200000 iterations takes 55822206 microseconds
csum_partial: sz[20] 1 iteration takes <279 microseconds>==<54 nanoseconds>
csum_partial: sz[21] 200000 iterations takes 64016336 microseconds
csum_partial: sz[21] 1 iteration takes <320 microseconds>==<62 nanoseconds>
csum_partial: sz[22] 200000 iterations takes 61093525 microseconds
csum_partial: sz[22] 1 iteration takes <305 microseconds>==<59 nanoseconds>
csum_partial: sz[23] 200000 iterations takes 65481187 microseconds
csum_partial: sz[23] 1 iteration takes <327 microseconds>==<63 nanoseconds>
csum_partial: sz[24] 200000 iterations takes 61168735 microseconds
csum_partial: sz[24] 1 iteration takes <305 microseconds>==<59 nanoseconds>
csum_partial: sz[25] 200000 iterations takes 69001025 microseconds
csum_partial: sz[25] 1 iteration takes <345 microseconds>==<67 nanoseconds>
csum_partial: sz[26] 200000 iterations takes 66418949 microseconds
csum_partial: sz[26] 1 iteration takes <332 microseconds>==<64 nanoseconds>
csum_partial: sz[27] 200000 iterations takes 71029491 microseconds
csum_partial: sz[27] 1 iteration takes <355 microseconds>==<69 nanoseconds>
csum_partial: sz[28] 200000 iterations takes 66360048 microseconds
csum_partial: sz[28] 1 iteration takes <331 microseconds>==<64 nanoseconds>
csum_partial: sz[29] 200000 iterations takes 74353705 microseconds
csum_partial: sz[29] 1 iteration takes <371 microseconds>==<72 nanoseconds>
csum_partial: sz[30] 200000 iterations takes 71737147 microseconds
csum_partial: sz[30] 1 iteration takes <358 microseconds>==<69 nanoseconds>
csum_partial: sz[31] 200000 iterations takes 76436420 microseconds
csum_partial: sz[31] 1 iteration takes <382 microseconds>==<74 nanoseconds>
csum_partial: sz[32] 200000 iterations takes 40741293 microseconds
csum_partial: sz[32] 1 iteration takes <203 microseconds>==<39 nanoseconds>
csum_partial: sz[33] 200000 iterations takes 48651585 microseconds
csum_partial: sz[33] 1 iteration takes <243 microseconds>==<47 nanoseconds>
csum_partial: sz[34] 200000 iterations takes 46065031 microseconds
csum_partial: sz[34] 1 iteration takes <230 microseconds>==<44 nanoseconds>
csum_partial: sz[35] 200000 iterations takes 50451529 microseconds
csum_partial: sz[35] 1 iteration takes <252 microseconds>==<49 nanoseconds>
csum_partial: sz[36] 200000 iterations takes 45143096 microseconds
csum_partial: sz[36] 1 iteration takes <225 microseconds>==<43 nanoseconds>
csum_partial: sz[37] 200000 iterations takes 53111738 microseconds
csum_partial: sz[37] 1 iteration takes <265 microseconds>==<51 nanoseconds>
csum_partial: sz[38] 200000 iterations takes 50421138 microseconds
csum_partial: sz[38] 1 iteration takes <252 microseconds>==<49 nanoseconds>
csum_partial: sz[39] 200000 iterations takes 54893618 microseconds
csum_partial: sz[39] 1 iteration takes <274 microseconds>==<53 nanoseconds>
csum_partial: sz[40] 200000 iterations takes 50453690 microseconds
csum_partial: sz[40] 1 iteration takes <252 microseconds>==<49 nanoseconds>
csum_partial: sz[41] 200000 iterations takes 58411664 microseconds
csum_partial: sz[41] 1 iteration takes <292 microseconds>==<57 nanoseconds>
csum_partial: sz[42] 200000 iterations takes 55732943 microseconds
csum_partial: sz[42] 1 iteration takes <278 microseconds>==<54 nanoseconds>
csum_partial: sz[43] 200000 iterations takes 60187768 microseconds
csum_partial: sz[43] 1 iteration takes <300 microseconds>==<58 nanoseconds>
csum_partial: sz[44] 200000 iterations takes 55766810 microseconds
csum_partial: sz[44] 1 iteration takes <278 microseconds>==<54 nanoseconds>
csum_partial: sz[45] 200000 iterations takes 63732553 microseconds
csum_partial: sz[45] 1 iteration takes <318 microseconds>==<62 nanoseconds>

Now, some other CPU types.

sun4m MicroSparcI, 40mhz, 16k icache 20k dcache (can't remember if
that's right or not...):
measure_csum_partial
csum_partial: sz[1024] 10000 iterations takes 62380730 microseconds
csum_partial: sz[1024] 1 iteration takes <6238 microseconds>==<1218 nanoseconds>
csum_partial: sz[2048] 10000 iterations takes 127199716 microseconds
csum_partial: sz[2048] 1 iteration takes <12719 microseconds>==<2484 nanoseconds>

measure_csum_partial_copy
csum_partial_copy: sz[1024] 10000 iterations takes 180276825 microseconds
csum_partial_copy: sz[1024] 1 iteration takes <18027 microseconds>==<3520 nanoseconds>

That's around 2.5us/Kbyte for csum, 3.6us/Kbyte for csum_copy, not bad
and could be a bit better with some reworking of the scheduling tuned
for the msI instruction cache to get fewer stalls.  Probably not worth
it though (I come to this conclusion notably because when Gordon Irlam
reworked the SunOS/Solaris window trap handlers to do an extra window
for each trap for cache reasons it turned out to make no difference
performance wise in the "real world" although the code was indeed more
efficient on the msI).

sun4m SS10 Viking/MXCC, 50Mhz, 1mb physical cache, 16k icache
20k dcache (again, correct me here if I am wrong on the i/d sizes):
measure_csum_partial
csum_partial: sz[1024] 10000 iterations takes 31893405 microseconds
csum_partial: sz[1024] 1 iteration takes <3189 microseconds>==<622 nanoseconds>
csum_partial: sz[2048] 10000 iterations takes 60608539 microseconds
csum_partial: sz[2048] 1 iteration takes <6060 microseconds>==<1183 nanoseconds>
csum_partial: sz[3072] 10000 iterations takes 89327514 microseconds
csum_partial: sz[3072] 1 iteration takes <8932 microseconds>==<1744 nanoseconds>
csum_partial: sz[4096] 10000 iterations takes 118067205 microseconds
csum_partial: sz[4096] 1 iteration takes <11806 microseconds>==<2305 nanoseconds>

measure_csum_partial_copy
csum_partial_copy: sz[1024] 10000 iterations takes 45946764 microseconds
csum_partial_copy: sz[1024] 1 iteration takes <4594 microseconds>==<897 nanoseconds>
csum_partial_copy: sz[2048] 10000 iterations takes 88614537 microseconds
csum_partial_copy: sz[2048] 1 iteration takes <8861 microseconds>==<1730 nanoseconds>
csum_partial_copy: sz[3072] 10000 iterations takes 131338863 microseconds
csum_partial_copy: sz[3072] 1 iteration takes <13133 microseconds>==<2565 nanoseconds>

The huge physical cache certainly seems to make a difference, although
I think it is also noticeable that the on-chip insn cache would do
better with my algorithm if it were just a tad bigger.  Ho hum...
Note also how it doesn't scale as linearly as on the Hyper and the
msI, again this puzzles me because the icache is smaller on msI.

sun4c SLC, 20MHZ, 64k I/D combined L2 virtual cache:
measure_csum_partial
csum_partial: sz[1024] 10000 iterations takes 227199286 microseconds
csum_partial: sz[1024] 1 iteration takes <22719 microseconds>==<4437 nanoseconds>

measure_csum_partial_copy
csum_partial_copy: sz[1024] 10000 iterations takes 539569714 microseconds
csum_partial_copy: sz[1024] 1 iteration takes <53956 microseconds>==<10538 nanoseconds>

Not too bad for this piece of garbage, the pure virtual 64k cache is
probably the real helper here.  I see no other factor that can get
numbers like this on such a shit cpu.  4.5us/Kbyte for csum, and
10.6us/Kbyte for csum/copy... shit, Jacobson's m68k algorithm only gets
134us/Kbyte on a 20MHZ 68020.

sun4c IPX, 40MHZ, 64k I/D combined L2 virtual cache:
measure_csum_partial
csum_partial: sz[1024] 10000 iterations takes 112377730 microseconds
csum_partial: sz[1024] 1 iteration takes <11237 microseconds>==<2194 nanoseconds>
csum_partial: sz[2048] 10000 iterations takes 223668577 microseconds
csum_partial: sz[2048] 1 iteration takes <22366 microseconds>==<4368 nanoseconds>

measure_csum_partial_copy
csum_partial_copy: sz[1024] 10000 iterations takes 345064536 microseconds
csum_partial_copy: sz[1024] 1 iteration takes <34506 microseconds>==<6739 nanoseconds>

A little more than twice as fast as the SLC, again the 64k virtual
cache is the performance factor even here and it seems the cache can
keep up with this cpu when it's hot.  2.2us/Kbyte for csum, and
6.8us/Kbyte for csum_copy, not bad at all.

I've verified all of my new code with a very extensive "cksum_helper"
program I wrote which also ran the above benchmarks after the
verification suite completed successfully.  Now I just have to stick
this shit into the kernel and see if it goes ok.  If all goes well we
might end up being able to beat Solaris on those TCP lmbench bandwidth
numbers, no promises.

Later,
David S. Miller
davem@caip.rutgers.edu

From owner-linux@cthulhu.engr.sgi.com  Wed May 15 09:42:22 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id JAA01100; Wed, 15 May 1996 09:42:21 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id QAA26408 for linux-list; Wed, 15 May 1996 16:40:57 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id JAA26403 for <linux@cthulhu.engr.sgi.com>; Wed, 15 May 1996 09:40:55 -0700
Received: from lanta.engr.sgi.com (lanta.engr.sgi.com [192.26.75.15]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id JAA01002 for <lmlinux@neteng.engr.sgi.com>; Wed, 15 May 1996 09:40:55 -0700
Received: by lanta.engr.sgi.com (940816.SGI.8.6.9/911001.SGI)
	 id JAA08277; Wed, 15 May 1996 09:40:53 -0700
Date: Wed, 15 May 1996 09:40:53 -0700
From: nn@lanta.engr.sgi.com (Neal Nuckolls)
Message-Id: <199605151640.JAA08277@lanta.engr.sgi.com>
To: davem@caip.rutgers.edu
Subject: Re: numbers...
Cc: sparclinux-cvs@caipfs.rutgers.edu, torvalds@cs.helsinki.fi,
        lmlinux@neteng.engr.sgi.com
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk


> sun4m SS10 115mhz hypersparc, 256k cache
> measure_csum_partial
> csum_partial: sz[1024] 10000 iterations takes 17009430 microseconds
> csum_partial: sz[1024] 1 iteration takes <1700 microseconds>==<332 nanoseconds>

Maybe I'm dense this morning but I don't understand the numbers.

You measure the realtime to do a sequential IP checksum on a 1K
buffer and you always invalidate the processor cache (I$, D$, and
Secondary$ or just some subset of these?) and don't include the
cost of doing the cache invalidate in the time above.  10000 of
these costs a hair over 17s, divide by 10000 gives 1.7ms per
iteration. This seems way high so must include the cost of the
cache invalidate or something?

What does the 332ns refer to?

My back of the envelope:
If I recall, the cacheline on a sun4m is 32bytes.  Assuming
something in the 300-600ns/secondarycachemiss range and a single
pending cachemiss at a time would put most any "touch the data"
operation on 1K of data in the 9.6-19.2us ballpark or 9-18ns/byte
range.
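
Spelled out as a small C sketch, using only the assumptions above (a
32 byte line, 300-600ns per miss, one miss outstanding at a time);
cold_touch_ns() is a made-up helper name:

```c
/*
 * Hypothetical helper: cost in nanoseconds of touching `bytes` of cold
 * data when every cache line costs one full miss and only a single
 * miss can be outstanding at a time.
 */
static unsigned long cold_touch_ns(unsigned long bytes,
				   unsigned long line_bytes,
				   unsigned long miss_ns)
{
	unsigned long lines = (bytes + line_bytes - 1) / line_bytes;

	return lines * miss_ns;
}
```

For 1K of data that is 32 lines, so 32 * 300ns = 9.6us at the low end
and 32 * 600ns = 19.2us at the high end.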

Does the hypersparc processor support multiple concurrent cache
misses? Or does it have a Viking-like sequential reference
detector and automatic cache prefetch logic?

> sun4m MicroSparcI, 40mhz, 16k icache 20k dcache
> ...
> Thats around 2.5us/Kbyte for csum, 

This must be hot$ that you're talking about?  Exactly when do you
invalidate the caches?

thanks.

neal

From owner-linux@cthulhu.engr.sgi.com  Wed May 15 15:44:48 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id PAA21025; Wed, 15 May 1996 15:44:47 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id WAA26316 for linux-list; Wed, 15 May 1996 22:43:22 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id PAA26301 for <linux@cthulhu.engr.sgi.com>; Wed, 15 May 1996 15:43:21 -0700
Received: from localhost (lm@localhost) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via SMTP id PAA19394; Wed, 15 May 1996 15:43:19 -0700
Message-Id: <199605152243.PAA19394@neteng.engr.sgi.com>
To: tridge@cs.anu.edu.au
From: lm@gate1-neteng.engr.sgi.com (Larry McVoy)
cc: linux-mc@arvidsjaur.anu.edu.au, Linus.Torvalds@cs.Helsinki.FI,
        linux@neteng.engr.sgi.com, alan@cymru.net
Subject: Re: mpp kernel interface 
Date: Wed, 15 May 1996 15:43:19 -0700
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

: With 30 people now on this list there must be someone else who wants
: to say something ...

I'm going to try.  I'm probably going to do a half assed job, as I am 
feeling a bit sick today.  But I'll give it a go.

: It really depends on how we end up using this stuff. Does anyone out
: there have any experience with implementing things like remote fork,
: remote paging, remote files/sockets etc? What sorts of interfaces are
: useful for doing stuff like that?

I have a few ideas.  Lots of them are stolen from the UCLA Locus project, 
Jim Bob says check it out.

First, this is mostly a naming problem.  Wrap your brain around that.  We
need to have clear in our heads exactly how everything is named. Once
you can draw that picture, the implementation becomes straightforward.
After you read through this, you may or may not be in basic agreement.
If we ever converge, I will volunteer to write "man pages" for each of
the chunks of work so that we have a well documented TODO list.  It is
as close as I've ever gotten to a spec.

Things we need to name

	hosts
	cluster
	processes
	devices
	files
	sockets

other things

	cluster {join,leave}
	SMP vs cluster?

The hard one is sockets.  I've never seen a good solution to that.
I'll come back to that.  First, I want to go through the other ones and
offer suggestions.

Hosts are just hosts, nothing changes there.

Clusters are a mapping of "cluster_name" -> "list of hosts".  Think
of getclustbyname().  I did this for SparcClusters, it worked pretty
well.  The cluster is the name that you use to get to your cluster.
This implies that you whack rlogin/telnet and/or you whack DNS
to translate from the cluster name to the hostname to the IP address
of that host.  DNS is the easiest place.
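
A rough C sketch of what I mean by that mapping.  Everything here is
hypothetical: struct clustent, the signature of getclustbyname(), the
static table, the node names, and clust_next_host() are all invented
for illustration by analogy with gethostbyname(), not an existing
interface:

```c
#include <stddef.h>
#include <string.h>

#define CLUST_MAXHOSTS 8

/* Hypothetical entry, by analogy with struct hostent. */
struct clustent {
	const char *c_name;
	const char *c_hosts[CLUST_MAXHOSTS + 1];	/* NULL-terminated */
};

/* Made-up static table; a real one would come from DNS or a config file. */
static struct clustent clust_table[] = {
	{ "cluster", { "node0", "node1", "node2", NULL } },
};

/* Map a cluster name to its entry, or NULL if the name is unknown. */
static struct clustent *getclustbyname(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(clust_table) / sizeof(clust_table[0]); i++)
		if (strcmp(clust_table[i].c_name, name) == 0)
			return &clust_table[i];
	return NULL;
}

/* Round-robin pick of the next member host, or NULL for an empty cluster. */
static const char *clust_next_host(const struct clustent *c)
{
	static size_t next;
	size_t n = 0;

	while (c->c_hosts[n] != NULL)
		n++;
	return n ? c->c_hosts[next++ % n] : NULL;
}
```

Whoever does the name translation calls something like
clust_next_host() and hands back a different member each time.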

Processes have two major chunks of work, the PID name space and how you
do remote processes.  For the PID name space, make pid_t a 32 bit int,
make the top 16 bits the host part, and the bottom 16 bits the pid part.
(We need to come back to the host part when we discuss process migration.)
A host part of "0" means "this host".  So a "kill -HUP 1" will always
restart init.
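
A minimal C sketch of that layout.  The cpid_* names are hypothetical,
and I use a separate typedef here rather than touching the real pid_t:

```c
#include <stdint.h>

typedef uint32_t cpid_t;	/* hypothetical cluster-wide pid */

#define CPID_HOST_SHIFT	16
#define CPID_LOCAL_MASK	0xffffu
#define CPID_THIS_HOST	0u	/* host part 0 means "this host" */

/* Build a cluster pid: top 16 bits host, bottom 16 bits local pid. */
static cpid_t cpid_make(uint16_t host, uint16_t pid)
{
	return ((cpid_t)host << CPID_HOST_SHIFT) | pid;
}

static uint16_t cpid_host(cpid_t p)
{
	return (uint16_t)(p >> CPID_HOST_SHIFT);
}

static uint16_t cpid_local(cpid_t p)
{
	return (uint16_t)(p & CPID_LOCAL_MASK);
}

/* True if p names a process on the host whose id is my_host. */
static int cpid_is_local(cpid_t p, uint16_t my_host)
{
	uint16_t h = cpid_host(p);

	return h == CPID_THIS_HOST || h == my_host;
}
```

With this layout a plain "1" still decodes as host 0, pid 1, which is
why the kill above always lands on the local init.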

Remote processes.  This part gets fun, fun, fun.  Think of how we have
files as objects in the kernel, in C++ terms, an inode is a class with all
virtual methods - each instance of the class implements its own methods.
We need to do that for processes.  For every operation we currently
do on a process, we need to make that a function call.  You want this
layer to be very thin, it can't be a performance hit for local processes
or we blew it.  The instances of the process class I imagine are at
least "local", "remote", "local debugged", "remote debugged".  I can
also imagine using this to implement "local or remote, gang scheduled".
See how that works?  I'm hand waving like crazy here.  I'm not positive
the debugging stuff works in this layer, but I would love it if all the
System V /proc crap was buried in an optional instance of the process
class.  I believe that you can make the non debugged instance faster.
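
In C the "class with virtual methods" is just a table of function
pointers hanging off the process.  A hand-waving sketch, with every
name made up for illustration (struct proc_ops, proc_signal(), and the
stand-in bodies are not existing kernel code):

```c
#include <stddef.h>

struct proc;

/* Hypothetical per-process method table, in the style of inode operations. */
struct proc_ops {
	int (*signal)(struct proc *p, int sig);
};

struct proc {
	int pid;
	int host;		/* 0 means "this host" */
	int last_sig;		/* last signal delivered locally */
	int remote_msgs;	/* requests shipped to another host */
	const struct proc_ops *ops;
};

/* Local instance: deliver directly (stand-in for the real delivery). */
static int local_signal(struct proc *p, int sig)
{
	p->last_sig = sig;
	return 0;
}

/* Remote instance: a real version would ship the request to p->host. */
static int remote_signal(struct proc *p, int sig)
{
	(void)sig;
	p->remote_msgs++;
	return 0;
}

static const struct proc_ops local_proc_ops = { local_signal };
static const struct proc_ops remote_proc_ops = { remote_signal };

/* The thin layer: one indirect call, no local/remote test in the caller. */
static int proc_signal(struct proc *p, int sig)
{
	return p->ops->signal(p, sig);
}
```

The cost for a local process is a single indirect call, which is the
"very thin" part.  Debugged or gang scheduled variants would be more
ops tables, not more branches in proc_signal().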

Devices I sort of punt on.  For device access, I would just use the 
remote mag tape protocol (or something very, very similar) so that all
of the locking, etc., still works - since you ship all the requests to
the system w/ the device, that kernel can implement the locks.  Any
issues here?

Files.  I have also punted on this.  I have never gotten that excited about
a cache coherent distributed file system, though others certainly do.  It's
not because I don't think it is useful.  I believe it can be done and that
we have the talent right here on this list to do it.  So I'll bow out of
commenting on it, other than to say make sure mmap works right.

Sockets.  This is a hard problem.  Some people think that a socket
should stick around after the CPU that created the socket has crashed.
I don't know how to do that efficiently, so I punt.  The only thing I
would suggest in the socket domain is (a) make gethostbyname("cluster")
be translated into a getclustbyname("cluster") so that you can think of
the cluster as one host (DNS or someone is round robining the host you
actually get), and (b) manage the routing tables within the cluster far
more dynamically than routing tables are currently managed.  The latter
means that you notice immediately when a CPU leaves the cluster and
update your routing tables to route around that dead CPU.

Cluster {join,leave}  This turns out to be a thorny area.  You gotta get it
right, too.  You want the cluster to keep working in the face of a crashed
node.  You also want nodes to be taken out and added back into the cluster
for whatever reasons.  There's a whole set of protocol issues here that I'm 
too sick to describe, trust me that we have a lot of work in this area.

SMP vs cluster.  Lots of people wonder about this - do you do one, both, 
neither?  The smartest people I know, have all concluded the same thing:
you do 4 way SMP nodes, and cluster those to scale beyond the 4 CPUs.
The reasoning is as follows: when you are creating processes, you 
are striping them across the cluster (if you have a coherent distributed
mmap, rfork() is not that hard, seems worse than it is, but as a first
pass, just do rexec(), that will scale parallel make which is a great
test case).  It turns out from cache affinity studies that you really
want to avoid moving processes from one CPU/cache/memory to another.
It screws up your performance.  On the other hand, if you get unlucky
with your striping alg and land a couple of long running processes on
the same node, it is nice to have more than one CPU there to handle the
load.  Another way to state this: you need to balance your load.  If
you try and statically balance it at fork and/or exec time, you will
make mistakes because you don't have enough info at that point to do a
perfect scheduling job.  If you are sending jobs to SMPs instead of
uniprocessors, the SMPs can do a reasonable job of dynamically scheduling
the load.  And a 4 way SMP is a good first order approximation of an
infinitely large SMP, it is rare that you'll screw up so badly that 
you need more than 4 processors to smooth out the load.

So why am I beating on this issue so hard?  Because I don't want a 
fine grained threaded SMP kernel.  Threading the kernel that much introduces
way too much of a performance penalty for the simple case of "run this
one job".  Also consider that the SMP kernel may well form the basis for
a fully preemptable uniprocessor kernel.  I do not want to sacrifice more
than 1% performance on the uniprocessor to get preemption.  Solaris and
IRIX both suck at this - you pay big time for that checklist item.  

If you walk into the SMP arena with the model that you "fork" your system
every 4 processors, then you don't have to work so hard to get scaling.
Coarse grained, non intrusive locking will work just fine for 4 processors
and you leave the rest to the cluster.  Cool?

Finally - when doing all of this stuff, please do a 100Mbit ethernet based
version as well as the AP version.  If you come at it from a network point
of view, a whole bunch of problems will _not_ happen in the AP version.
When you have all that nice hardware, you tend to use it and it can
screw up the architecture such that a network based cluster won't work.
On the other hand, if you do a network based cluster, you are guaranteed
that you have a partitioned solution.  As you try and make all of those
kernels work on that one big shared memory, you'll find that partitioning
is a big performance win.  Coming from a network cluster, you'll get that
without having to work for it - the other way frequently is harder.

That's enough for now, how about some comments?

--lm

From owner-linux@cthulhu.engr.sgi.com  Wed May 15 23:23:37 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id XAA11138; Wed, 15 May 1996 23:23:37 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id GAA20606 for linux-list; Thu, 16 May 1996 06:22:10 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id XAA20601 for <linux@cthulhu.engr.sgi.com>; Wed, 15 May 1996 23:22:09 -0700
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id XAA11092 for <lmlinux@neteng.engr.sgi.com>; Wed, 15 May 1996 23:22:08 -0700
Received: from sgi.sgi.com (sgi.engr.sgi.com [150.166.76.30]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id XAA20594; Wed, 15 May 1996 23:22:06 -0700
Received: from caipfs.rutgers.edu by sgi.sgi.com via ESMTP (950405.SGI.8.6.12/910110.SGI)
	 id XAA13123; Wed, 15 May 1996 23:22:04 -0700
Received: from huahaga.rutgers.edu (huahaga.rutgers.edu [128.6.155.53]) by caipfs.rutgers.edu (8.6.9+bestmx+oldruq+newsunq+grosshack/8.6.9) with ESMTP id CAA21020; Thu, 16 May 1996 02:21:12 -0400
Received: (davem@localhost) by huahaga.rutgers.edu (8.6.9+bestmx+oldruq+newsunq+grosshack/8.6.9) id CAA10579; Thu, 16 May 1996 02:21:12 -0400
Date: Thu, 16 May 1996 02:21:12 -0400
Message-Id: <199605160621.CAA10579@huahaga.rutgers.edu>
From: "David S. Miller" <davem@caip.rutgers.edu>
To: nn@lanta.engr.sgi.com
CC: sparclinux-cvs@caipfs.rutgers.edu, torvalds@cs.helsinki.fi,
        lmlinux@neteng.engr.sgi.com
In-reply-to: <199605151640.JAA08277@lanta.engr.sgi.com>
	(nn@lanta.engr.sgi.com)
Subject: Re: numbers...
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

   Date: Wed, 15 May 1996 09:40:53 -0700
   From: nn@lanta.engr.sgi.com (Neal Nuckolls)

   > sun4m SS10 115mhz hypersparc, 256k cache
   > measure_csum_partial
   > csum_partial: sz[1024] 10000 iterations takes 17009430 microseconds
   > csum_partial: sz[1024] 1 iteration takes <1700 microseconds>==<332 nanoseconds>

   Maybe I'm dense this morning but I don't understand the numbers.

[NOTE: real lmbench results with the new checksum code in a bit, it's
       not as much of an improvement as I wanted and Solaris still
       gets better TCP bandwidth on localhost.]

The benchmark looks like:

	for(iter=0; iter < 10000; iter++) {
		for(inner=0; inner < 512; inner++) {
			csum(foo, bar, baz);
			flush_caches();
		}
	}

The 512 number is just empirical because for small buffers (less than
1k) doing just one iteration made it impossible to measure anything
significant.

I am factoring in the time the cache flush takes.  I calculate how
long it takes to do the flush before the loop runs, then subtract that
value multiplied by (iter * inner) from the final time.

So the 17009430 microseconds is the time it takes to run the entire
loop structure minus the flushing overhead.

1700 microseconds is the time each run of the inner for loop took,
again minus the flush overhead.  332 nanoseconds is the absolute time
each instance of the csum() took to run on the buffer, once again
after subtracting the flush overhead.
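
The accounting, as a small C sketch.  net_total_us() and per_call_ns()
are hypothetical helper names and the loop counts are parameters, so
this only illustrates the subtraction and division described above and
is not the actual harness:

```c
/*
 * Hypothetical helper: total loop time with the separately measured
 * flush cost removed, in microseconds.
 */
static double net_total_us(double raw_total_us, double flush_us,
			   unsigned long outer, unsigned long inner)
{
	return raw_total_us - flush_us * (double)outer * (double)inner;
}

/* Hypothetical helper: time for a single csum() call, in nanoseconds. */
static double per_call_ns(double net_us,
			  unsigned long outer, unsigned long inner)
{
	return net_us * 1000.0 / ((double)outer * (double)inner);
}
```

The flush cost is subtracted once per csum() call because the flush
sits inside the inner loop.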

Later,
David S. Miller
davem@caip.rutgers.edu

From owner-linux@cthulhu.engr.sgi.com  Thu May 16 00:04:19 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id AAA12368; Thu, 16 May 1996 00:04:18 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id HAA22629 for linux-list; Thu, 16 May 1996 07:02:54 GMT
Received: from sgi.sgi.com (sgi.engr.sgi.com [150.166.76.30]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id AAA22622 for <linux@engr.sgi.com>; Thu, 16 May 1996 00:02:53 -0700
Received: from kyoko.mpx.com.au by sgi.sgi.com via ESMTP (950405.SGI.8.6.12/910110.SGI)
	for <linux@engr.sgi.com> id AAA15488; Thu, 16 May 1996 00:02:47 -0700
Received: from gatekeeper.awa.com.au(really [203.10.64.114]) by kyoko.mpx.com.au
	via smail with smtp
	id <m0uJx58-0006NOC@kyoko.mpx.com.au>
	for <linux@engr.sgi.com>; Thu, 16 May 96 17:02:42 +1000 (EST)
	(/\##/\ Smail3.1.30.13 #30.8 built 5-oct-95)
Received: from desertoak.awa.com.au(really [150.207.60.100]) by gatekeeper.awa.com.au
	via sendmail with esmtp
	id <m0uINZw-000RiKC@gatekeeper.awa.com.au>
	for <linux@engr.sgi.com>; Sun, 12 May 96 08:56:00 +1000 (AEST)
	(/\##/\ Smail3.1.30.13 #30.16 built 2-may-96)
Received: from cephalotus(really [150.207.24.34]) by desertoak.awa.com.au
	via sendmail with smtp
	id <m0uJx54-001kAjC@desertoak.awa.com.au>
	for <linux@engr.sgi.com>; Thu, 16 May 96 17:02:38 +1000 (AEST)
	(/\##/\ Smail3.1.30.13 #30.11 built 8-may-96)
Message-Id: <m0uJx54-001kAjC@desertoak.awa.com.au>
Date: Thu, 16 May 96 17:02:38 +1000 (AEST)
X-Sender: mbeach@desertoak
X-Mailer: Windows Eudora Light Version 1.5.2
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
To: linux@cthulhu.engr.sgi.com
From: Michael Beach <mbeach@awa.com.au>
Subject: References for MIPS hacking
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

Hello all.

Is anything much happening on this list? I haven't seen any activity for
quite a while? A few real questions too...

Does this port target the R4000 and up, or is it meant to run on R3000s as
well? Also, can anyone point me at the 'canonical reference' for the MIPS
processors? Do MIPS publish programmers reference manuals, or do I have to
hassle their 'technology partners' ie LSI, IDT etc.

Thanks in advance!

Regards
M.Beach


From owner-linux@cthulhu.engr.sgi.com  Thu May 16 00:05:53 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id AAA12432; Thu, 16 May 1996 00:05:53 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id HAA22857 for linux-list; Thu, 16 May 1996 07:04:27 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id AAA22842 for <linux@cthulhu.engr.sgi.com>; Thu, 16 May 1996 00:04:24 -0700
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id AAA12382 for <lmlinux@neteng.engr.sgi.com>; Thu, 16 May 1996 00:04:23 -0700
Received: from sgi.sgi.com (sgi.engr.sgi.com [150.166.76.30]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id AAA22827; Thu, 16 May 1996 00:04:21 -0700
Received: from caipfs.rutgers.edu by sgi.sgi.com via ESMTP (950405.SGI.8.6.12/910110.SGI)
	 id AAA15579; Thu, 16 May 1996 00:04:19 -0700
Received: from huahaga.rutgers.edu (huahaga.rutgers.edu [128.6.155.53]) by caipfs.rutgers.edu (8.6.9+bestmx+oldruq+newsunq+grosshack/8.6.9) with ESMTP id CAA22273; Thu, 16 May 1996 02:53:50 -0400
Received: (davem@localhost) by huahaga.rutgers.edu (8.6.9+bestmx+oldruq+newsunq+grosshack/8.6.9) id CAA12098; Thu, 16 May 1996 02:53:50 -0400
Date: Thu, 16 May 1996 02:53:50 -0400
Message-Id: <199605160653.CAA12098@huahaga.rutgers.edu>
From: "David S. Miller" <davem@caip.rutgers.edu>
To: lmlinux@neteng.engr.sgi.com
CC: torvalds@cs.helsinki.fi, sparclinux-cvs@caipfs.rutgers.edu, alan@cymru.net
Subject: lmbench with new checksum code...
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk


It's not as hot as I wanted it to be, ho hum... Just for flavor I've
included numbers for my 115MHZ hypersparc running SunOS so no one
forgets that this is all running on a shitty SparcClassic. ;-)

I'm probably stalling the chip stupidly in my code or not touching the
cache lines in the correct order, ugh this is pissing me off, I'll
check it out some more tomorrow when my head doesn't hurt so much...
I doubt I can get 2mb/s more out of my code to beat solaris, sigh,
where the heck could all this overhead possibly be???

                    L M B E N C H  1 . 0   S U M M A R Y
                    ------------------------------------

            Processor, Processes - times in microseconds
            --------------------------------------------
Host                 OS  Mhz    Null    Null  Simple /bin/sh Mmap 2-proc 8-proc
                             Syscall Process Process Process  lat  ctxsw  ctxsw
--------- ------------- ---- ------- ------- ------- ------- ---- ------ ------
trombetas  Linux 1.99.3   50      15    8.8K   40.9K     75K  350     83    100
geneva.ru     SunOS 5.5   50      31   33.7K  148.2K    274K  596    174    205
negro.rut SunOS 4.1.3_U   49     124   18.3K   63.9K    110K  470    152    262
huahaga.r   SunOS 4.1.4  115      32   10.4K   34.3K     59K  169     96    104

            *Local* Communication latencies in microseconds
            -----------------------------------------------
Host                 OS  Pipe       UDP    RPC/     TCP    RPC/
                                            UDP             TCP
--------- ------------- ------- ------- ------- ------- -------
trombetas  Linux 1.99.3     295    1016    1756    1408    2564
geneva.ru     SunOS 5.5     530    1563    2080    1354    2398
negro.rut SunOS 4.1.3_U     890    1375    2287    1573    2804
huahaga.r   SunOS 4.1.4     306     616     956     667    1161

            *Local* Communication bandwidths in megabytes/second
            ----------------------------------------------------
Host                 OS Pipe  TCP  File   Mmap  Bcopy  Bcopy  Mem   Mem
                                  reread reread (libc) (hand) read write
--------- ------------- ---- ---- ------ ------ ------ ------ ---- -----
trombetas  Linux 1.99.3    8  4.8   25.0   17.4     18     24   41    36
geneva.ru     SunOS 5.5    8  7.0   12.6   19.5     18     18   40    36
negro.rut SunOS 4.1.3_U    4  2.0   19.5    8.2     18     24   41    36
huahaga.r   SunOS 4.1.4   14  5.3   30.2   19.9     20     22   48    37

            Memory latencies in nanoseconds
            (WARNING - may not be correct, check graphs)
            --------------------------------------------
Host                 OS   Mhz  L1 $   L2 $    Main mem    TLB    Guesses
--------- -------------   ---  ----   ----    --------    ---    -------
trombetas  Linux 1.99.3    50    20    170         180    600    No L2 cache?
geneva.ru     SunOS 5.5    49     -      -           -      -    Bad mhz?
negro.rut SunOS 4.1.3_U    49    20    175         183    659    No L2 cache?
huahaga.r   SunOS 4.1.4   115    17     17         510    842    No L1 cache?

From owner-linux@cthulhu.engr.sgi.com  Thu May 16 02:21:25 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id CAA17152; Thu, 16 May 1996 02:21:25 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id JAA01804 for linux-list; Thu, 16 May 1996 09:20:00 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id CAA01794 for <linux@cthulhu.engr.sgi.com>; Thu, 16 May 1996 02:19:57 -0700
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id CAA17105 for <linux@neteng.engr.sgi.com>; Thu, 16 May 1996 02:19:57 -0700
Received: from sgi.sgi.com (sgi.engr.sgi.com [150.166.76.30]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id CAA01784; Thu, 16 May 1996 02:19:54 -0700
Received: from snowcrash.cymru.net by sgi.sgi.com via ESMTP (950405.SGI.8.6.12/910110.SGI)
	 id CAA21099; Thu, 16 May 1996 02:19:50 -0700
Received: (from alan@localhost) by snowcrash.cymru.net (8.7.1/8.7.1) id KAA08950; Thu, 16 May 1996 10:09:19 +0100
From: Alan Cox <alan@cymru.net>
Message-Id: <199605160909.KAA08950@snowcrash.cymru.net>
Subject: Re: mpp kernel interface
To: lm@gate1-neteng.engr.sgi.com (Larry McVoy)
Date: Thu, 16 May 1996 10:09:17 +0100 (BST)
Cc: tridge@cs.anu.edu.au, linux-mc@arvidsjaur.anu.edu.au,
        Linus.Torvalds@cs.Helsinki.FI, linux@neteng.engr.sgi.com,
        alan@cymru.net
In-Reply-To: <199605152243.PAA19394@neteng.engr.sgi.com> from "Larry McVoy" at May 15, 96 03:43:19 pm
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

> : With 30 people now on this list there must be someone else who wants
> : to say something ...

I've not actually seen anything from the list (should be going to ukuu)

> 	SMP vs cluster?

Why vs? Why not AND? An SMP box is just a couple of well connected nodes.

> The hard one is sockets.  I've never seen a good solution to that.
> I'll come back to that.  First, I want to go through the other ones and
> offer suggestions.

There are several approaches to this. One is to treat it like a device, so
you open a socket on a given machine and it, like the device, doesn't walk.
The other is to bend mobile IP to our needs. That would probably want
something like an old 486 as a network FEP.

> do remote processes.  For the PID name space, make pid_t a 32 bit int,
> make the top 16 bits the host part, and the bottom 16 bits the pid part.
> (We need to come back to the host part when we discuss process migration.)
> A host part of "0" means "this host".  So a "kill -HUP 1" will always
> restart init.

We can also do this for devices so you can mknod a printer on a different 
node.

> Devices I sort of punt on.  For device access, I would just use the 
> remote mag tape protocol (or something very, very similar) so that all
> of the locking, etc., still works - since you ship all the requests to
> the system w/ the device, that kernel can implement the locks.  Any
> issues here?

The vfs issues fairly controlled requests to the FS layer, and the device
layer also is fairly clean. The MOSIX project intercepted stuff at this
level --- so a remote device turns the request into a message. The system
also accounted messages so a process like a find would migrate across cpus
as it changed the disk it was searching.

> we have the talent right here on this list to do it.  So I'll bow out of
> commenting on it, other than to say make sure mmap works right.

mmap is foul. SYS5 shared memory is just as bad too. 

> Sockets.  This is a hard problem.  Some people think that a socket
> should stick around after the CPU that created the socket has crashed.

If you have a single apparent IP address this is true for TCP. On spotting
a down host you can just send an RST and go into TIME_WAIT to preserve the
corruption protection properties of TCP.

> is a big performance win.  Coming from a network cluster, you'll get that
> without having to work for it - the other way frequently is harder.

It also means more people can play.


From owner-linux@cthulhu.engr.sgi.com  Thu May 16 02:45:50 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id CAA17955; Thu, 16 May 1996 02:45:50 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id JAA02565 for linux-list; Thu, 16 May 1996 09:44:21 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id CAA02560 for <linux@cthulhu.engr.sgi.com>; Thu, 16 May 1996 02:44:19 -0700
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id CAA17903 for <lmlinux@neteng.engr.sgi.com>; Thu, 16 May 1996 02:44:18 -0700
Received: from sgi.sgi.com (sgi.engr.sgi.com [150.166.76.30]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id CAA02551 for <lmlinux@neteng.engr.sgi.com>; Thu, 16 May 1996 02:44:17 -0700
Received: from snowcrash.cymru.net by sgi.sgi.com via ESMTP (950405.SGI.8.6.12/910110.SGI)
	for <lmlinux@neteng.engr.sgi.com> id CAA22098; Thu, 16 May 1996 02:44:14 -0700
Received: (from alan@localhost) by snowcrash.cymru.net (8.7.1/8.7.1) id KAA09812; Thu, 16 May 1996 10:35:05 +0100
From: Alan Cox <alan@cymru.net>
Message-Id: <199605160935.KAA09812@snowcrash.cymru.net>
Subject: Re: lmbench with new checksum code...
To: davem@caip.rutgers.edu (David S. Miller)
Date: Thu, 16 May 1996 10:35:04 +0100 (BST)
Cc: lmlinux@neteng.engr.sgi.com, torvalds@cs.helsinki.fi,
        sparclinux-cvs@caipfs.rutgers.edu, alan@cymru.net
In-Reply-To: <199605160653.CAA12098@huahaga.rutgers.edu> from "David S. Miller" at May 16, 96 02:53:50 am
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

> I doubt I can get 2mb/s more out of my code to beat solaris, sigh,
> where the heck could all this overhead possibly be???

What makes you think the Solaris loopback is even doing checksums or a memcpy
via kernel space? The only way you'll beat Solaris at the loopback network
game is to cheat as they do. Make tcp_connect spot a localhost connection and
change the socket method to something akin to af_unix but streamlined a bit
(the only special case is urgent data).

Alan


From owner-linux@cthulhu.engr.sgi.com  Thu May 16 03:10:12 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id DAA18984; Thu, 16 May 1996 03:10:12 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id KAA03443 for linux-list; Thu, 16 May 1996 10:08:46 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id DAA03438 for <linux@cthulhu.engr.sgi.com>; Thu, 16 May 1996 03:08:44 -0700
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id DAA18931 for <lmlinux@neteng.engr.sgi.com>; Thu, 16 May 1996 03:08:42 -0700
Received: from sgi.sgi.com (sgi.engr.sgi.com [150.166.76.30]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id DAA03430 for <lmlinux@neteng.engr.sgi.com>; Thu, 16 May 1996 03:08:41 -0700
Received: from porsta.cs.Helsinki.FI by sgi.sgi.com via ESMTP (950405.SGI.8.6.12/910110.SGI)
	for <lmlinux@neteng.engr.sgi.com> id DAA22956; Thu, 16 May 1996 03:08:39 -0700
Received: from linux.cs.Helsinki.FI (linux.cs.Helsinki.FI [128.214.48.39]) by porsta.cs.Helsinki.FI (8.6.10/8.6.9) with SMTP id NAA19285; Thu, 16 May 1996 13:08:28 +0300
Date: Thu, 16 May 1996 13:08:04 +0300 (EET DST)
From: Linus Torvalds <torvalds@cs.Helsinki.FI>
To: Alan Cox <alan@cymru.net>
cc: "David S. Miller" <davem@caip.rutgers.edu>, lmlinux@neteng.engr.sgi.com,
        sparclinux-cvs@caipfs.rutgers.edu
Subject: Re: lmbench with new checksum code...
In-Reply-To: <199605160935.KAA09812@snowcrash.cymru.net>
Message-ID: <Pine.LNX.3.91.960516124452.6325L-100000@linux.cs.Helsinki.FI>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk



On Thu, 16 May 1996, Alan Cox wrote:
> 
> What makes you think the Solaris loopback is even doing checksums or a memcpy
> via kernel space? The only way you'll beat Solaris at the loopback network
> game is to cheat as they do. Make tcp_connect spot a localhost connection and
> change the socket method to something akin to af_unix but streamlined a bit
> (the only special case is urgent data).

One day we might actually want to do something like that. No, I'm not
suggesting special-casing loopback (we don't need it any more, we're getting
good enough performance as-is), but I would suggest that at some point we
integrate the networking code a bit tighter in the VFS model of open/close. 

One problem with the networking code right now is that we can't short-circuit
some of the decisions, so we're doing unnecessary run-time checks. This is
not really much of a performance problem; I consider it more of a beauty wart
right now. 

For example, when we open a normal pipe (be it unnamed or named), we 
never go through the filesystem code for that pipe any more - we just 
make the file operations point directly to the pipe code, and when we 
read/write to that pipe we don't have any filesystem overhead.

In contrast, when we open a network connection, we always go through the 
network layer, even at run-time. Admittedly there are some reasons for 
this, but most of them aren't really valid any more.

For example, sockets used to really be totally separate entities from 
inodes, so we couldn't consider sockets to be part of the VFS layer. But 
that isn't true any more (and hasn't been for a long while): sockets 
really _are_ implemented as a part of the inode these days. So a "socket" 
is really just the network-specific part of an inode (the same way the 
"normal" filesystems have their own private parts - look at the inode "u" 
union to see what I mean).

However, due to historical reasons, the internal socket routines do not 
use the VFS "inode" abstraction, but instead they use only the socket 
sub-part. Sometime in the future I would really like to get rid of that, 
and make the low-level socket code use the "inode" the same way everybody 
else does.

This is _definitely_ not a huge issue - as I said it's more cleanliness 
and encapsulation than performance. It will require making the 
socket-specific IO operations (recvmsg etc) be first-class members in the 
VFS layer etc, and in general it requires a lot of minor modifications. 
Nothing terribly hard (and some things get cleaner thanks to it), but 
it's a lot of code that has to be worked out.

Merging the sockets more tightly into the VFS layer gets rid of the current
"struct socket" that we don't really need (as opposed to the "struct sock",
which is a different beast altogether) and at least partly the "struct
proto_ops" (which would just be part of the "struct file_operations").  We'd
get rid of one (slightly confusing) layer of abstraction, and some cases
could be streamlined a bit. 

Finally, let me say again that I don't think we should short-circuit loopback
TCP like Solaris does. I used to think it was a nifty feature, but I got
better. When I talk about streamlining above, I'm thinking of similar
short-circuiting, but on a much smaller scale (getting rid of run-time checks
that can be done when the state of the thing changes instead). 

For an example of what I'm talking about, look at how the tty layer uses the
operations pointers to handle hangup etc. It just changes the file operations
pointer, which automagically means that the files start behaving differently
once they've been hung up (without having the actual routines do any extra
"am I hung up" checking).  That's the kind of thing that the network layer
could do too if it was better integrated with the VFS layer. 
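
A minimal user-space sketch of that operations-pointer swap, written in
Python rather than kernel C. Every name here (File, NORMAL_OPS, HUNG_UP_OPS,
hangup) is hypothetical and chosen for illustration; none of them is a real
kernel interface. The point is only that the state change replaces the
operations table once, so the normal read/write routines never test for it.

```python
class HungUpError(Exception):
    """Raised by a write on a file whose terminal has hung up."""


def normal_read(f, n):
    # Normal path: note there is no "am I hung up" test anywhere in here.
    data, f.buffer = f.buffer[:n], f.buffer[n:]
    return data


def normal_write(f, data):
    f.buffer += data
    return len(data)


def hung_up_read(f, n):
    # Installed by hangup(); reads on a hung-up file report end-of-file.
    return b""


def hung_up_write(f, data):
    # Installed by hangup(); writes on a hung-up file always fail.
    raise HungUpError("write to hung-up terminal")


# Two operations tables, playing the role of two sets of file operations.
NORMAL_OPS = {"read": normal_read, "write": normal_write}
HUNG_UP_OPS = {"read": hung_up_read, "write": hung_up_write}


class File:
    def __init__(self):
        self.buffer = b""
        self.ops = NORMAL_OPS  # the "operations pointer"

    def read(self, n):
        return self.ops["read"](self, n)

    def write(self, data):
        return self.ops["write"](self, data)


def hangup(f):
    # The only place the state is examined: swap the pointer once.
    f.ops = HUNG_UP_OPS
```

After hangup(f), every later f.read() or f.write() lands in the hung-up
routines without the normal routines having changed at all.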

[ Alan has heard some of this before - I've been talking about these changes
  for a long time. I've never felt they have been important enough to really
  do something radical about it, but I still think it's the right thing to do
  eventually ]

		Linus

From owner-linux@cthulhu.engr.sgi.com  Thu May 16 07:22:31 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id HAA26774; Thu, 16 May 1996 07:22:31 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id OAA14020 for linux-list; Thu, 16 May 1996 14:21:06 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id HAA14015 for <linux@cthulhu.engr.sgi.com>; Thu, 16 May 1996 07:21:04 -0700
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id HAA26731 for <linux@neteng.engr.sgi.com>; Thu, 16 May 1996 07:21:04 -0700
Received: from sgi.sgi.com (sgi.engr.sgi.com [150.166.76.30]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id HAA14001; Thu, 16 May 1996 07:21:01 -0700
Received: from arvidsjaur.anu.edu.au by sgi.sgi.com via ESMTP (950405.SGI.8.6.12/910110.SGI)
	 id HAA08955; Thu, 16 May 1996 07:20:58 -0700
Received: (from tridge@localhost) by arvidsjaur.anu.edu.au (8.7.3/8.6.9) id AAA24100; Fri, 17 May 1996 00:20:34 +1000
Date: Fri, 17 May 1996 00:20:34 +1000
Message-Id: <199605161420.AAA24100@arvidsjaur.anu.edu.au>
From: Andrew Tridgell <tridge@cs.anu.edu.au>
To: lm@gate1-neteng.engr.sgi.com
CC: linux-mc@arvidsjaur.anu.edu.au, Linus.Torvalds@cs.Helsinki.FI,
        linux@neteng.engr.sgi.com, alan@cymru.net
In-reply-to: <199605152243.PAA19394@neteng.engr.sgi.com>
	(lm@gate1-neteng.engr.sgi.com)
Subject: Re: mpp kernel interface
Reply-to: Andrew.Tridgell@anu.edu.au
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

Larry,

I think what you describe is really a blueprint for a "throughput"
machine, a machine that gets its parallelism mostly from the fact that
you will be running lots of independent jobs on it at once. The
alternative is a "parallel" machine, which aims to get optimal
performance even when just running one or two programs.

For example, a throughput machine is an ideal departmental
server. Lots of people doing edits/compiles with some heavy computation
thrown in now and then. It's the sort of thing that clusters of
workstations normally do.

A "parallel" machine is what supercomputer centres often have. They
run just one or two jobs at once, but they are big jobs, like climate
modelling or fluid dynamics simulations. They use huge amounts of ram
(many GB) for the one process and require very tight communications
using specialised parallel libraries.

Our research group is really centered around parallel systems, with
algorithms that scale to thousands of cpus. Unfortunately our budget
only stretches to 16 cpus on the AP+ at the moment, but we can also
run on much larger systems like we did this week by connecting to the
"Parallel Computing Research Facility" at Fujitsu in Japan.

This approach has a number of implications:

- we're not as worried about the ability to dynamically enter/leave a
"cluster". This makes algorithms simpler and faster as they can use
data structures that assume that the number of cpus and their layout
stays static.

- we're not as worried about recovering/continuing if a cpu
crashes. If all user jobs are running on all cpus then it doesn't make
much sense to try to recover when one goes down, as it kills all user
jobs anyway, so you might as well shut down (crash), replace the part
and start up again. It's not as though this is a common experience for
us anyway. I don't think we've had a hardware fault on our 128 cpu
ap1000 yet, after several years of operation.

- we're worried about getting the very best bandwidth/latency out of
the communications network, while still providing all the lovely
operating system services that Linux provides.

- we're worried about providing efficient parallel filesystem, memory
and networking abstractions, scalable to lots of processors.

> Processes have two major chunks of work, the PID name space and how you
> do remote processes.  For the PID name space, make pid_t a 32 bit int,
> make the top 16 bits the host part, and the bottom 16 bits the pid part.
> (We need to come back to the host part when we discuss process migration.)
> A host part of "0" means "this host".  So a "kill -HUP 1" will always
> restart init.

ok, this makes sense with what we've done so far, which is really just
a tightly integrated network of workstations. I'm not at all sure it's
what we want in the long term however. It assumes that things like
init will be running on every cpu, so that you have to distinguish
which copy of init you mean when you send it a signal.
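
The encoding proposed in the quoted text can be sketched as below. This is
only an illustration of the 16/16 bit split described there; the function
names (make_pid, split_pid, is_local) are made up for this sketch and do not
correspond to any existing kernel code.

```python
HOST_BITS = 16                    # top half of the 32 bit pid: host part
PID_BITS = 16                     # bottom half: per-host pid part
PID_MASK = (1 << PID_BITS) - 1
THIS_HOST = 0                     # a host part of 0 means "this host"


def make_pid(host, pid):
    """Pack a (host, pid) pair into one 32 bit cluster-wide pid."""
    if not 0 <= host < (1 << HOST_BITS):
        raise ValueError("host part does not fit in 16 bits")
    if not 0 <= pid <= PID_MASK:
        raise ValueError("pid part does not fit in 16 bits")
    return (host << PID_BITS) | pid


def split_pid(cluster_pid):
    """Unpack a cluster-wide pid into its (host, pid) parts."""
    return cluster_pid >> PID_BITS, cluster_pid & PID_MASK


def is_local(cluster_pid):
    """True when the host part says "this host"."""
    return (cluster_pid >> PID_BITS) == THIS_HOST
```

With this packing, a plain pid 1 has host part 0, so "kill -HUP 1" still
names the local init, while make_pid(3, 1) names the copy of init on host 3.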

I'm hoping that we will eventually have a really "single system image"
machine, where only one copy of init is actually running. Most cpus
would not be running any system daemons at all, just the necessary
kernel threads. 

Right now we do in fact have one copy of init on each cpu, along with
lots of other daemons. We can get away with them all having the same
pid because the system isn't really parallel yet: there is no notion
of a remote syscall. (We have in fact done a remote signal send
operation for parallel programs, but it's not as general as a remote
kill.)

> Devices I sort of punt on.  For device access, I would just use the 
> remote mag tape protocol (or something very, very similar) so that all
> of the locking, etc., still works - since you ship all the requests to
> the system w/ the device, that kernel can implement the locks.  Any
> issues here?

mostly speed issues. Could using something like rmt really scale to
lots of processors with reasonable performance?
 
> Files.  I have also punted on this.  I have never gotten that excited about
> a cache coherent distributed file system, though others certainly do.  It's
> not because I don't think it is useful.  I believe it can be done and that
> we have the talent right here on this list to do it.  So I'll bow out of
> commenting on it, other than to say make sure mmap works right.

The big problems are indeed cache coherency and things like mmap(). We
implemented a parallel filesystem called HiDIOS on the 128 node ap1000
which worked by putting the buffer cache for a block on the cell that
owns the block. We didn't support mmap() as the machine had no mmu,
and the only cache local to each cpu is an optional one controlled by
the user in much the same way as stdio, but applied to file
descriptors in the C library.
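
A toy sketch of the "cache lives on the owning cell" idea follows. It is not
HiDIOS code. In particular the round-robin rule in owner_cell() is purely an
assumption made for this example (the text above does not say how blocks are
assigned to cells), and the Cell and Cluster classes are hypothetical.

```python
def owner_cell(block, num_cells):
    # ASSUMPTION: blocks are striped round-robin over the cells. The real
    # ownership rule is not described in the text; this is a stand-in.
    return block % num_cells


class Cell:
    def __init__(self):
        self.disk = {}        # block number -> bytes held on this cell's disk
        self.cache = {}       # buffer cache, only for blocks this cell owns
        self.disk_reads = 0   # how often we had to go to the disk

    def read_block(self, block):
        if block not in self.cache:
            self.cache[block] = self.disk[block]
            self.disk_reads += 1
        return self.cache[block]


class Cluster:
    def __init__(self, num_cells, data):
        self.cells = [Cell() for _ in range(num_cells)]
        for block, payload in data.items():
            self.cells[owner_cell(block, num_cells)].disk[block] = payload

    def read(self, block):
        # Whichever cell asks, the request is routed to the owner. Only the
        # owner ever caches the block, so there is a single cached copy and
        # nothing to keep coherent between cells.
        owner = owner_cell(block, len(self.cells))
        return self.cells[owner].read_block(block)
```

The price is that every read of a non-local block crosses the network, which
is only tolerable when remote access is about as fast as local memory.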

We were able to get away with this because the remote memory access
bandwidth was very high (slightly higher in fact than local memory),
and latencies were low. We also used a really simple meta data
structure that completely eliminated indirect blocks to find data on the
disk. We got 60MB/sec through the filesystem (the hardware limit is
64MB/sec).

We've seen attempts to put standard unix filesystem structures on
parallel machines and they are just not efficient enough. The cost of
manipulating all that meta data is huge when the disks (in parallel)
are capable of higher throughputs than a local memory copy.

We are planning on doing a similar parallel filesystem for
Linux/AP+. We'll use a more sophisticated meta data scheme than was
used in the AP1000 HiDIOS, but still much simpler than that used in
ext2. It will be tuned for big io tasks, like loading 2GB of data
before the start of a heavy computing task. It will probably be
pathetic for loading your .cshrc, but we aren't going to be
encouraging people to use this system to run shell scripts on :-)

I still don't know how we are going to handle mmap. We think mmap is
very important in a parallel filesystem, but we just don't know how to
implement it in a really fast and coherent way yet.
 
> Sockets.  This is a hard problem.  Some people think that a socket
> should stick around after the CPU that created the socket has crashed.

yep, sockets are probably hard. We've already met problems with
them and we haven't even tried to make them parallel yet :-)

We use sockets to implement the stdin/stdout/stderr of parallel
processes. The paralleld that launches parallel programs on each cell
first creates 3 sockets back to the launching program, setting them up
as file descriptors 0, 1 and 2. When it then does a fork()/exec() the
parallel program inherits them.

The problem is that on a 64 cell machine we end up having 192 sockets
connected to the one process on the front end that launched the
parallel program. This is madness! It also won't work well if we have
1024 cpus :-)

I'm hoping this problem will go away when we revamp the way we launch
parallel programs. If we had a remote fork() and/or remote exec()
and also had a way for the file descriptors of remote forked processes
to feed back into the parent cpu then it would be much better. 

We'd probably also need to use a tree structure to feed the file
descriptors (and paging for that matter) back up into the parent
process. 1000 children all writing to one parent would not be pretty. 
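
The arithmetic behind the socket counts, and the depth a relay tree would
need, can be sketched as follows. The fan-out of 8 used in the note below is
an arbitrary assumption for illustration, and both function names are
hypothetical.

```python
def direct_sockets(cells, streams=3):
    """Sockets held by the launcher when every cell connects straight
    back with stdin, stdout and stderr (3 streams per cell)."""
    return cells * streams


def tree_levels(cells, fanout):
    """Levels of relaying needed so that no process has more than
    `fanout` children feeding into it."""
    if cells < 1 or fanout < 2:
        raise ValueError("need at least 1 cell and a fan-out of 2 or more")
    levels, reach = 0, 1
    while reach < cells:      # reach = cells coverable with `levels` levels
        reach *= fanout
        levels += 1
    return levels
```

direct_sockets(64) gives the 192 mentioned above and direct_sockets(1024)
gives 3072, whereas with an assumed fan-out of 8 a tree over 1024 cells needs
4 levels and no process holds more than 8 * 3 = 24 child sockets.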

> Cluster {join,leave}  This turns out to be a thorny area.  You gotta get it
> right, too.  You want the cluster to keep working in the face of a crashed
> node.  You also want nodes to be taken out and added back into the cluster
> for whatever reasons.  There's a whole set of protocol issues here that I'm 
> too sick to describe, trust me that we have a lot of work in this area.

This is a really interesting area to work on, but it's probably not
something our group will be looking at soon, for the reasons described
earlier. We've got to focus our attentions a bit :-)
 
> Finally - when doing all of this stuff, please do a 100Mbit ethernet based
> version as well as the AP version.  If you come at it from a network point
> of view, a whole bunch of problems will _not_ happen in the AP version.
> When you have all that nice hardware, you tend to use it and it can
> screw up the architecture such that a network based cluster won't work.
> On the other hand, if you do a network based cluster, you are guaranteed
> that you have a partitioned solution.  As you try and make all of those
> kernels work on that one big shared memory, you'll find that partitioning
> is a big performance win.  Coming from a network cluster, you'll get that
> without having to work for it - the other way frequently is harder.

We probably won't do a 100Mbit version ourselves, we just don't have
the time. We'd love to cooperate with people that are doing it,
however, and I hope that a lot of what we do will be relevant for
systems like that.

The problem is really latency. Ethernet type systems have latencies
which aren't much lower than the system clock tick interval. This
means it often makes sense to do things in quite different ways to
what we have to do.

Cheers, Andrew


From owner-linux@cthulhu.engr.sgi.com  Thu May 16 09:05:05 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id JAA03283; Thu, 16 May 1996 09:05:04 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id QAA22884 for linux-list; Thu, 16 May 1996 16:03:40 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id JAA22868 for <linux@cthulhu.engr.sgi.com>; Thu, 16 May 1996 09:03:37 -0700
Received: from localhost (lm@localhost) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via SMTP id JAA03193; Thu, 16 May 1996 09:03:36 -0700
Message-Id: <199605161603.JAA03193@neteng.engr.sgi.com>
To: "David S. Miller" <davem@caip.rutgers.edu>
From: lm@gate1-neteng.engr.sgi.com (Larry McVoy)
cc: lmlinux@neteng.engr.sgi.com, torvalds@cs.helsinki.fi,
        sparclinux-cvs@caipfs.rutgers.edu, alan@cymru.net
Subject: Re: lmbench with new checksum code... 
Date: Thu, 16 May 1996 09:03:36 -0700
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

:             *Local* Communication bandwidths in megabytes/second
:             ----------------------------------------------------
: Host                 OS Pipe  TCP  File   Mmap  Bcopy  Bcopy  Mem   Mem
:                                   reread reread (libc) (hand) read write
: --------- ------------- ---- ---- ------ ------ ------ ------ ---- -----
: trombetas  Linux 1.99.3    8  4.8   25.0   17.4     18     24   41    36
: geneva.ru     SunOS 5.5    8  7.0   12.6   19.5     18     18   40    36
: negro.rut SunOS 4.1.3_U    4  2.0   19.5    8.2     18     24   41    36

This is sorta weird - why is it that mmap reread is slower than file reread?
Do you have a kernel bcopy that is faster than memory read?

I think your 4.8MB/sec number is pretty studly.  That means you are 
checksumming 9MB/sec as well as the protocol stack on a system that 
bcopies at ~20MB/sec.  You're already better than 2x the SunOS code.
Call it a day, you won.
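
The back-of-envelope numbers above can be reproduced as below. The factor of
two is an assumption about the reasoning: it presumes each byte sent over
loopback is checksummed once on the send side and once on the receive side.
The variable and function names are made up for this sketch; the rates are
the ones in the table.

```python
LINUX_TCP_MB_S = 4.8   # trombetas, Linux 1.99.3, local TCP bandwidth
SUNOS_TCP_MB_S = 2.0   # negro.rut, SunOS 4.1.3_U, same hardware class


def checksummed_mb_s(tcp_mb_s, passes=2):
    # ASSUMPTION: one checksum pass on send plus one on receive.
    return tcp_mb_s * passes


linux_checksum_rate = checksummed_mb_s(LINUX_TCP_MB_S)  # roughly 9 MB/sec
speedup_over_sunos = LINUX_TCP_MB_S / SUNOS_TCP_MB_S    # better than 2x
```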

From owner-linux@cthulhu.engr.sgi.com  Thu May 16 09:05:29 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id JAA03300; Thu, 16 May 1996 09:05:29 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id QAA23005 for linux-list; Thu, 16 May 1996 16:04:04 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id JAA22994 for <linux@cthulhu.engr.sgi.com>; Thu, 16 May 1996 09:04:03 -0700
Received: from xtp.engr.sgi.com (xtp.engr.sgi.com [150.166.75.34]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id JAA03222 for <lmlinux@neteng.engr.sgi.com>; Thu, 16 May 1996 09:04:02 -0700
Received: by xtp.engr.sgi.com (940816.SGI.8.6.9/911001.SGI)
	 id JAA06415; Thu, 16 May 1996 09:04:00 -0700
From: "Greg Chesson" <greg@xtp.engr.sgi.com>
Message-Id: <9605160903.ZM6413@xtp.engr.sgi.com>
Date: Thu, 16 May 1996 09:03:58 -0700
In-Reply-To: Linus Torvalds <torvalds@cs.Helsinki.FI>
        "Re: lmbench with new checksum code..." (May 16,  1:08pm)
References: <Pine.LNX.3.91.960516124452.6325L-100000@linux.cs.Helsinki.FI>
X-Mailer: Z-Mail (3.2.0 26oct94 MediaMail)
To: Linus Torvalds <torvalds@cs.Helsinki.FI>, Alan Cox <alan@cymru.net>
Subject: Re: lmbench with new checksum code...
Cc: "David S. Miller" <davem@caip.rutgers.edu>, lmlinux@neteng.engr.sgi.com,
        sparclinux-cvs@caipfs.rutgers.edu
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

folks have special-cased the loopback driver in the past to improve
X performance.

g

From owner-linux@cthulhu.engr.sgi.com  Thu May 16 12:05:09 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id MAA05147; Thu, 16 May 1996 12:05:08 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id TAA22974 for linux-list; Thu, 16 May 1996 19:03:42 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id MAA22942 for <linux@cthulhu.engr.sgi.com>; Thu, 16 May 1996 12:03:39 -0700
Received: from lanta.engr.sgi.com (lanta.engr.sgi.com [192.26.75.15]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id MAA05094 for <lmlinux@neteng.engr.sgi.com>; Thu, 16 May 1996 12:03:38 -0700
Received: by lanta.engr.sgi.com (940816.SGI.8.6.9/911001.SGI)
	 id MAA13605; Thu, 16 May 1996 12:03:35 -0700
Date: Thu, 16 May 1996 12:03:35 -0700
From: nn@lanta.engr.sgi.com (Neal Nuckolls)
Message-Id: <199605161903.MAA13605@lanta.engr.sgi.com>
To: Alan Cox <alan@cymru.net>
Subject: Re: lmbench with new checksum code...
Cc: sparclinux-cvs@caipfs.rutgers.edu, torvalds@cs.helsinki.fi,
        lmlinux@neteng.engr.sgi.com
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

Alan is right, the Solaris tcp/ip implementation has a special case
for loopback which skims just the top of the IP module, no checksum,
no copy at that level, large mtu.

neal

From owner-linux@cthulhu.engr.sgi.com  Thu May 16 15:27:34 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id PAA15946; Thu, 16 May 1996 15:27:34 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id WAA29186 for linux-list; Thu, 16 May 1996 22:26:09 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id PAA29178 for <linux@cthulhu.engr.sgi.com>; Thu, 16 May 1996 15:26:08 -0700
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id PAA15897 for <lmlinux@neteng.engr.sgi.com>; Thu, 16 May 1996 15:26:06 -0700
Received: from sgi.sgi.com (sgi.engr.sgi.com [150.166.76.30]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id PAA29164 for <lmlinux@neteng.engr.sgi.com>; Thu, 16 May 1996 15:26:05 -0700
Received: from snowcrash.cymru.net by sgi.sgi.com via ESMTP (950405.SGI.8.6.12/910110.SGI)
	for <lmlinux@neteng.engr.sgi.com> id PAA21591; Thu, 16 May 1996 15:25:59 -0700
Received: from lxorguk.ukuu.org.uk (Ulxorguk@localhost) by snowcrash.cymru.net (8.7.1/8.7.1) with UUCP id XAA01922 for neteng.engr.sgi.com!lmlinux; Thu, 16 May 1996 23:12:14 +0100
Received: by lightning.swansea.linux.org.uk (Smail3.1.29.1 #2)
	id m0uK6ux-0005FbC; Thu, 16 May 96 18:32 BST
Message-Id: <m0uK6ux-0005FbC@lightning.swansea.linux.org.uk>
From: alan@lxorguk.ukuu.org.uk (Alan Cox)
Subject: Re: lmbench with new checksum code...
To: lm@gate1-neteng.engr.sgi.com (Larry McVoy)
Date: Thu, 16 May 1996 18:32:51 +0100 (BST)
Cc: davem@caip.rutgers.edu, lmlinux@neteng.engr.sgi.com,
        torvalds@cs.Helsinki.FI, sparclinux-cvs@caipfs.rutgers.edu,
        alan@cymru.net
In-Reply-To: <199605161603.JAA03193@neteng.engr.sgi.com> from "Larry McVoy" at May 16, 96 09:03:36 am
Content-Type: text
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

> I think your 4.8MB/sec number is pretty studly.  That means you are 
> checksumming 9MB/sec as well as the protocol stack on a system that 

Actually the Linux kernel cheats on the receive side of loopback and doesn't
checksum. It's too expensive to fiddle around on the send side for that, short
of doing the whole job and short-circuiting the TCP layer as Solaris seems to.

> bcopies at ~20MB/sec.  You're already better than 2x the SunOS code.
> Call it a day, you won.

Na.. We can cut all the checksums, do raw copies with no protocol overhead
and possibly later on do user->user copying (with page flips to keep larry
happy ;)). I can't see any reason we cannot do about 8-9MB/sec with a bit
of extra code and some sneaky tricks.

CONFIG_TCP_BENCHMARK_TRICKS 

anyone ?



From owner-linux@cthulhu.engr.sgi.com  Thu May 16 15:28:24 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id PAA15968; Thu, 16 May 1996 15:28:23 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id WAA29295 for linux-list; Thu, 16 May 1996 22:26:59 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id PAA29290 for <linux@cthulhu.engr.sgi.com>; Thu, 16 May 1996 15:26:58 -0700
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id PAA15928 for <linux@neteng.engr.sgi.com>; Thu, 16 May 1996 15:26:57 -0700
Received: from sgi.sgi.com (sgi.engr.sgi.com [150.166.76.30]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id PAA29283 for <linux@neteng.engr.sgi.com>; Thu, 16 May 1996 15:26:56 -0700
Received: from snowcrash.cymru.net by sgi.sgi.com via ESMTP (950405.SGI.8.6.12/910110.SGI)
	for <linux@neteng.engr.sgi.com> id PAA21745; Thu, 16 May 1996 15:26:49 -0700
Received: from lxorguk.ukuu.org.uk (Ulxorguk@localhost) by snowcrash.cymru.net (8.7.1/8.7.1) with UUCP id XAA01905 for neteng.engr.sgi.com!linux; Thu, 16 May 1996 23:11:50 +0100
Received: by lightning.swansea.linux.org.uk (Smail3.1.29.1 #2)
	id m0uK6ho-0005FbC; Thu, 16 May 96 18:19 BST
Message-Id: <m0uK6ho-0005FbC@lightning.swansea.linux.org.uk>
From: alan@lxorguk.ukuu.org.uk (Alan Cox)
Subject: Re: mpp kernel interface
To: Andrew.Tridgell@anu.edu.au
Date: Thu, 16 May 1996 18:19:16 +0100 (BST)
Cc: lm@gate1-neteng.engr.sgi.com, linux-mc@arvidsjaur.anu.edu.au,
        Linus.Torvalds@cs.Helsinki.FI, linux@neteng.engr.sgi.com,
        alan@cymru.net
In-Reply-To: <199605161420.AAA24100@arvidsjaur.anu.edu.au> from "Andrew Tridgell" at May 17, 96 00:20:34 am
Content-Type: text
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

> We use sockets to implement the stdin/stdout/stderr of parallel
> processes. The paralleld that launches parallel programs on each cell
> first creates 3 sockets back to the launching program, setting them up
> as file descriptors 0, 1 and 2. When it then does a fork()/exec() the
> parallel program inherits them.

Two things strike me here. Firstly if you are doing that kind of output
redirection across 192 cells you are going to need 192 logical connections
however you do it. Secondly you really want your node end library to be
a bit smarter and pass a tty check across the link so you can use tty/pty
pairs if the real descriptor is a tty.
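
A minimal C sketch of the two points above, assuming POSIX isatty() and
dup2(). The names pick_transport and attach_stdio are made up for
illustration and are not taken from paralleld or any node library; the
"pass the tty check across the link" step is reduced here to a local
isatty() call on the launcher's real descriptor.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical tag for how one remote stdio stream should be carried. */
enum stdio_transport { TRANSPORT_SOCKET, TRANSPORT_PTY };

/* Choose a tty/pty pair only when the launcher's real descriptor is a tty. */
enum stdio_transport pick_transport(int real_fd)
{
    return isatty(real_fd) ? TRANSPORT_PTY : TRANSPORT_SOCKET;
}

/*
 * Make `conn` (a connected socket or a pty slave) appear as descriptor
 * `target` (0, 1 or 2 in the scheme quoted above) so that a later
 * fork()/exec() inherits it.  Returns 0 on success, -1 on error.
 */
int attach_stdio(int conn, int target)
{
    assert(conn >= 0 && target >= 0);
    if (conn == target)
        return 0;               /* already in place, nothing to do */
    if (dup2(conn, target) < 0)
        return -1;
    return close(conn);         /* drop the original descriptor number */
}
```

The launcher would call pick_transport() once per descriptor and send the
answer to the cell, which then opens either a socket or a pty and hands it
to attach_stdio() before the exec.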

> parallel programs. If we had a remote fork() and/or remote exec()
> and also had a way for the file descriptors of remote forked processes
> to feed back into the parent cpu then it would be much better. 

MOSIX does this by trapping them at the VFS layer. Effectively each inode
and file handle has a host field and if the operation is remote you RPC.
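
As a rough illustration of that "host field, RPC if remote" idea, here is
a C sketch. It is not MOSIX code: struct rfile, LOCAL_HOST and both write
helpers are hypothetical stand-ins, and the "RPC" only bumps a counter so
the dispatch is observable.

```c
#include <assert.h>
#include <stddef.h>

#define LOCAL_HOST 0            /* hypothetical id meaning "this node" */

/* Hypothetical file handle that records which node owns the object. */
struct rfile {
    int host;                   /* owning node */
    int remote_calls;           /* stand-in counter for RPCs issued */
};

/* Stand-in for the ordinary local path; pretends every byte was written. */
static long local_write(struct rfile *f, const void *buf, size_t len)
{
    (void)f;
    (void)buf;
    return (long)len;
}

/* Stand-in for an RPC to f->host; a real one would marshal buf and wait. */
static long rpc_write(struct rfile *f, const void *buf, size_t len)
{
    (void)buf;
    f->remote_calls++;
    return (long)len;
}

/* The trap point: one test on the host field picks local or remote. */
long rfile_write(struct rfile *f, const void *buf, size_t len)
{
    assert(f != NULL);
    return f->host == LOCAL_HOST ? local_write(f, buf, len)
                                 : rpc_write(f, buf, len);
}
```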

> We'd probably also need to use a tree structure to feed the file
> descriptors (and paging for that matter) back up into the parent
> process. 1000 children all writing to one parent would not be pretty. 

It would be an interesting application of multicast groups to allow the parent
to roam as well. With 1000 children that's an even bigger scaling problem, and
for sending stuff to a large number of nodes (e.g. a loosely synchronized SIMD
job) it's going to be needed.

> The problem is really latency. Ethernet type systems have latencies
> which aren't much lower than the system clock tick interval. This
> means it often makes sense to do things in quite different ways to
> what we have to do.

Yes. The latency also means that attacking from two other angles is interesting.
Firstly 10Mbit ethernet - latency is no worse really, just we have to be more
reluctant to bulk copy data, and also combining it with something like the
TTL PAPERS device for the fast sync stuff (it's a $60-to-build parallel port
synchronization system with about a 3uS overhead). Very limited but might
solve some of our problems on ethernet linked boxes.

Alan


From owner-linux@cthulhu.engr.sgi.com  Thu May 16 21:01:19 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id VAA23805; Thu, 16 May 1996 21:01:18 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id EAA14641 for linux-list; Fri, 17 May 1996 04:01:14 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id VAA14636 for <linux@cthulhu.engr.sgi.com>; Thu, 16 May 1996 21:01:12 -0700
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id VAA23798 for <lmlinux@neteng.engr.sgi.com>; Thu, 16 May 1996 21:01:11 -0700
Received: from sgi.sgi.com (sgi.engr.sgi.com [150.166.76.30]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id VAA14628; Thu, 16 May 1996 21:01:10 -0700
Received: from caipfs.rutgers.edu by sgi.sgi.com via ESMTP (950405.SGI.8.6.12/910110.SGI)
	 id VAA25665; Thu, 16 May 1996 21:01:08 -0700
Received: from huahaga.rutgers.edu (huahaga.rutgers.edu [128.6.155.53]) by caipfs.rutgers.edu (8.6.9+bestmx+oldruq+newsunq+grosshack/8.6.9) with ESMTP id AAA20410; Fri, 17 May 1996 00:01:06 -0400
Received: (davem@localhost) by huahaga.rutgers.edu (8.6.9+bestmx+oldruq+newsunq+grosshack/8.6.9) id AAA19770; Fri, 17 May 1996 00:01:05 -0400
Date: Fri, 17 May 1996 00:01:05 -0400
Message-Id: <199605170401.AAA19770@huahaga.rutgers.edu>
From: "David S. Miller" <davem@caip.rutgers.edu>
To: lm@gate1-neteng.engr.sgi.com
CC: lmlinux@neteng.engr.sgi.com, torvalds@cs.helsinki.fi,
        sparclinux-cvs@caipfs.rutgers.edu, alan@cymru.net
In-reply-to: <199605161603.JAA03193@neteng.engr.sgi.com>
	(lm@gate1-neteng.engr.sgi.com)
Subject: Re: lmbench with new checksum code...
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

   From: lm@gate1-neteng.engr.sgi.com (Larry McVoy)
   Date: Thu, 16 May 1996 09:03:36 -0700

   I think your 4.8MB/sec number is pretty studly.  That means you are 
   checksumming 9MB/sec as well as the protocol stack on a system that 
   bcopies at ~20MB/sec.  You're already better than 2x the SunOS code.
   Call it a day, you won.

I did not win, this is unacceptable.  I am completely convinced based
upon the edge I have over Solaris for context switching and general
process/trap operations I should be able to match it even with
everything going through real networking.

Linux + full networking overhead == Solaris memcpy/cow-page overhead

I cannot accept this piss poor performance I'm getting, it must be
made faster.

(Yes, I'm ridiculous, Larry will tell you others how I feel that if
 I am presented with a "next generation" cpu and I cannot get from
 trap entry to kernel c-code in 12 completely pipelined non-stalling
 instructions then some hardware engineer has completely wasted his
 precious time ;-)

Later,
David S. Miller
davem@caip.rutgers.edu

From owner-linux@cthulhu.engr.sgi.com  Sat May 18 22:45:33 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id WAA29009; Sat, 18 May 1996 22:45:32 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id FAA16148 for linux-list; Sun, 19 May 1996 05:45:26 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id WAA16130 for <linux@cthulhu.engr.sgi.com>; Sat, 18 May 1996 22:45:24 -0700
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id WAA29006 for <lmlinux@neteng.engr.sgi.com>; Sat, 18 May 1996 22:45:22 -0700
Received: from sgi.sgi.com (sgi.engr.sgi.com [150.166.76.30]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id WAA16118 for <lmlinux@neteng.engr.sgi.com>; Sat, 18 May 1996 22:45:21 -0700
Received: from caipfs.rutgers.edu by sgi.sgi.com via ESMTP (950405.SGI.8.6.12/910110.SGI)
	for <lmlinux@neteng.engr.sgi.com> id WAA22054; Sat, 18 May 1996 22:45:20 -0700
Received: from huahaga.rutgers.edu (huahaga.rutgers.edu [128.6.155.53]) by caipfs.rutgers.edu (8.6.9+bestmx+oldruq+newsunq+grosshack/8.6.9) with ESMTP id BAA12139; Sun, 19 May 1996 01:45:18 -0400
Received: (davem@localhost) by huahaga.rutgers.edu (8.6.9+bestmx+oldruq+newsunq+grosshack/8.6.9) id BAA21316; Sun, 19 May 1996 01:45:17 -0400
Date: Sun, 19 May 1996 01:45:17 -0400
Message-Id: <199605190545.BAA21316@huahaga.rutgers.edu>
From: "David S. Miller" <davem@caip.rutgers.edu>
To: ecd@skynet.be
CC: lmlinux@neteng.engr.sgi.com, sparclinux-cvs@vger.rutgers.edu,
        alan@cymru.net, torvalds@cs.helsinki.fi
Subject: idea for csum_partial_copy on Viking/MXCC
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk


(Note: This is just another one of my crazy ideas, consider this
 something to do possibly in the future when someone has tons of
 copious free time.  For now I'm going to get the software version
 working as fast as it can.)

(Some background for some of you, Viking/MXCC Sparc has a hardware
 block copy facility which can copy cache sub-block aligned chunks of
 ram very quickly.)

This is silly, but it would get disgustingly fast numbers. (btw,
eddie, still waiting for the memcpy.s of yours so that I can do some
testing tonight...)

You use the MXCC stream copy stuff if you have a buffer bigger than
256 bytes and you can align it to 32 bytes.  The unrolled loops right
now look like:

	ldd	[%src + offset + 0x18], %t0;	! multi-cycle cache stall
	ldd	[%src + offset + 0x10], %t2;	! 1 cycle, cache hit
	ldd	[%src + offset + 0x08], %t4;	! 1 cycle, cache hit
	st	%t0, [%dest + offset + 0x18];	! multi-cycle cache stall
	addxcc	%t0, %accum, %accum;		! 1 cycle, does not pair
	st	%t1, [%dest + offset + 0x1c];
	addxcc	%t1, %accum, %accum;		! 1 cycle, cache hit
	st	%t2, [%dest + offset + 0x10];
	addxcc	%t2, %accum, %accum;		! 1 cycle, cache hit
	ldd	[%src + offset + 0x00], %t0;	! 1 cycle, cache hit
	st	%t3, [%dest + offset + 0x14];	! 1 cycle, cache hit, cannot pair
	addxcc	%t3, %accum, %accum;
	st	%t4, [%dest + offset + 0x08];	! 1 cycle, cache hit
	addxcc	%t4, %accum, %accum;
	st	%t5, [%dest + offset + 0x0c];	! 1 cycle, cache hit
	addxcc	%t5, %accum, %accum;
	st	%t0, [%dest + offset + 0x00];	! 1 cycle, cache hit
	addxcc	%t0, %accum, %accum;
	st	%t1, [%dest + offset + 0x04];	! 1 cycle, cache hit
	addxcc	%t1, %accum, %accum;

						! around 19 clock cycles

Bite me, those stores make this stuff impossible to schedule without
grabbing a register window which I refuse to do.
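
For readers who do not read SPARC assembly, the loop above can be
approximated in portable C as below. This is only a sketch under stated
assumptions: csum_copy_words and csum_fold are illustrative names, not the
kernel's csum_partial_copy; it handles one 32-bit word per step instead of
unrolled ldd pairs, requires len to be a multiple of 4, and says nothing
about the cycle counts in the comments. A 64-bit accumulator stands in for
the carry chain that addxcc provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy `len` bytes from src to dst and return the 32-bit ones'-complement
 * sum of the copied words added to `sum`.
 */
uint32_t csum_copy_words(const void *src, void *dst, size_t len, uint32_t sum)
{
    const unsigned char *s = src;
    unsigned char *d = dst;
    uint64_t acc = sum;
    size_t i;

    assert(len % 4 == 0);
    for (i = 0; i < len; i += 4) {
        uint32_t w;
        memcpy(&w, s + i, 4);   /* the load */
        memcpy(d + i, &w, 4);   /* the store that is hard to schedule */
        acc += w;               /* carries pile up in the top 32 bits */
    }
    /* End-around carry: fold the saved carries back in, twice to settle. */
    acc = (acc & 0xffffffffu) + (acc >> 32);
    acc = (acc & 0xffffffffu) + (acc >> 32);
    return (uint32_t)acc;
}

/* Fold a 32-bit partial sum to the 16-bit Internet checksum. */
uint16_t csum_fold(uint32_t sum)
{
    sum = (sum & 0xffffu) + (sum >> 16);
    sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)~sum;
}
```

The point of doing both jobs in one pass is that each word is loaded once
and then both stored and summed, rather than walking the buffer twice.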

Ok, on the MXCC you eat some cycles so that you have the registers
setup for the source (for the checksum calculations) and the page
numbers etc. for the stream operation for the entire chunk being
csum/copied.  Then it looks like this:

	st	%stream_addr1, [%stream_addr2] ASI_MXCC
	/* Processor stalls 3 or 4 clocks to get stream operation going. */

	ldd	[%src + offset + 0x18], %t0;	! cache hit
	addxcc	%t0, %accum, %accum;		! 1 cycle, does pair
	addxcc	%t1, %accum, %accum;		! 1 cycle, no pair
	ldd	[%src + offset + 0x10], %t2;	! cache hit
	addxcc	%t2, %accum, %accum;		! 1 cycle, does pair
	addxcc	%t3, %accum, %accum;		! 1 cycle, no pair
	ldd	[%src + offset + 0x08], %t4;	! cache hit
	addxcc	%t4, %accum, %accum;		! 1 cycle, cache hit
	addxcc	%t5, %accum, %accum;		! 1 cycle, no pair
	ldd	[%src + offset + 0x00], %t0;	! 1 cycle
	addxcc	%t0, %accum, %accum;		! 1 cycle, cache hit
	addxcc	%t1, %accum, %accum;		! 1 cycle, no pair

						! around 12 clock cycles

MXCC does all those ugly and hard to schedule stores for us ;-)  Note
that I could probably schedule that new sequence even better.

That is a saving of 7 clock cycles for _every_ 32 byte aligned block we
csum; the overhead of setting up for the stream operation is fuzzed away by
the fact that we usually run this thing many times in a row (thus the
"only do optimization if len >= 256" rule above).

Let's assume in such an implementation that we eat around 13 or 14
cycles getting the registers ready for the stream operation.  Fine,
then after two straight iterations of the above code sequence we are
breaking even, we commonly run it many times in a row.

Common case for full ethernet frame is that we do 128 bytes at a shot,
11 times.  This works out to:

	(7 clocks saved per iteration * (128 / 32) * 11) -
	(14 stream-op setup cycles * 11 iterations)

	== 308 saved cycles - 154 lost cycles
	== 154 clocks less per csum on ethernet sized packet frame

Old code == 846 total cycles for ethernet sized packet frame
New code == 692 "" ""
	    
We go ~20% faster ;-)  A possible issue is overhead of function
ptr dereference for the call, but based upon the performance of our
dynamic mmu code I doubt it would matter and it would give gcc some
dead cycles to fill in the networking code anyways.
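
The cycle arithmetic above can be re-run with a few lines of C. All the
figures are the ones quoted in this message; the constant and function
names are made up for the example. With these inputs the saving is 154 of
846 cycles, which is about 18%, in line with the rough ~20% quoted.

```c
#include <assert.h>

/* Figures quoted in the message; the names are illustrative only. */
enum {
    CYCLES_SAVED_PER_BLOCK = 7,    /* 19 - 12, per 32 byte block */
    BLOCK_BYTES            = 32,
    CHUNK_BYTES            = 128,  /* bytes handled per call */
    CHUNKS_PER_FRAME       = 11,   /* calls per full ethernet frame */
    SETUP_CYCLES           = 14,   /* stream-op setup, paid once per call */
    OLD_FRAME_CYCLES       = 846
};

/* Net cycles saved per ethernet frame: block savings minus setup cost. */
int cycles_saved_per_frame(void)
{
    int blocks;

    assert(CHUNK_BYTES % BLOCK_BYTES == 0);
    blocks = (CHUNK_BYTES / BLOCK_BYTES) * CHUNKS_PER_FRAME;
    return CYCLES_SAVED_PER_BLOCK * blocks - SETUP_CYCLES * CHUNKS_PER_FRAME;
}

/* Projected cost of the stream-copy version for one frame. */
int new_frame_cycles(void)
{
    return OLD_FRAME_CYCLES - cycles_saved_per_frame();
}

/* Smallest number of 32 byte blocks per call that pays back the setup. */
int break_even_blocks(void)
{
    return (SETUP_CYCLES + CYCLES_SAVED_PER_BLOCK - 1) / CYCLES_SAVED_PER_BLOCK;
}
```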

As noted previously, this would be a "research" venture to see what
kind of numbers it would really get.  Now that I think about it I
would be very leery about putting this into the tree so that we don't
hit the sun4d XBUS IOCACHE hardware bug (it is only triggered by MXCC
hardware block copy operations and certain types of dma activity with
a certain set of bit patterns in the data, nasty bug).

For now I'm re-scheduling the software csum/copy code to work as it
should (I was hitting the cache in the wrong way I've found from
Andy's numbers, fixing this right now).

Later,
David S. Miller
davem@caip.rutgers.edu

From owner-linux@cthulhu.engr.sgi.com  Mon May 20 17:17:32 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id RAA02728; Mon, 20 May 1996 17:17:32 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id AAA07173 for linux-list; Tue, 21 May 1996 00:17:24 GMT
Received: from yon.engr.sgi.com (yon.engr.sgi.com [150.166.61.32]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id RAA07156 for <linux@cthulhu.engr.sgi.com>; Mon, 20 May 1996 17:17:23 -0700
Received: (from ariel@localhost) by yon.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id RAA05977 for linux@engr; Mon, 20 May 1996 17:16:59 -0700
From: ariel@yon.engr.sgi.com (Ariel Faigon)
Message-Id: <199605210016.RAA05977@yon.engr.sgi.com>
Subject: Meeting wednesday
To: linux@cthulhu.engr.sgi.com
Date: Mon, 20 May 1996 17:16:59 -0700 (PDT)
Reply-To: ariel@cthulhu.engr.sgi.com
Organization: Silicon Graphics Inc.
X-Mailer: ELM [version 2.4 PL24 ME5a]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length:        525
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

Hi Linuxies

As you may know, David Miller's arrival is really close now.
Larry says this Friday...

We reserved Matisse in 9U, for this Wednesday for a one
hour meeting.

The goal is to try and see what we have ready for David
and see how we can improve our preparedness in the last
days that remain.

David: do you have names for the two hosts by now?
Bill: do you have 200MHz CPUs?
I understand we were short on our promises to pre-prepare
stuff since people were really busy, any good news would be nice.
-- 
Peace, Ariel

From owner-linux@cthulhu.engr.sgi.com  Mon May 20 22:42:58 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id WAA19515; Mon, 20 May 1996 22:42:58 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id FAA16255 for linux-list; Tue, 21 May 1996 05:42:51 GMT
Received: from ares.esd.sgi.com (fddi-ares.engr.sgi.com [192.26.80.60]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id WAA16249; Mon, 20 May 1996 22:42:49 -0700
Received: from fir.esd.sgi.com by ares.esd.sgi.com via ESMTP (951211.SGI.8.6.12.PATCH1042/950213.SGI.AUTOCF)
	 id WAA18921; Mon, 20 May 1996 22:42:49 -0700
Received: by fir.esd.sgi.com (940816.SGI.8.6.9/920502.SGI.AUTO)
	 id WAA04124; Mon, 20 May 1996 22:42:49 -0700
Date: Mon, 20 May 1996 22:42:49 -0700
From: wje@fir.esd.sgi.com (William J. Earl)
Message-Id: <199605210542.WAA04124@fir.esd.sgi.com>
To: ariel@cthulhu.engr.sgi.com
Cc: linux@cthulhu.engr.sgi.com
Subject: Re: Meeting wednesday
In-Reply-To: <199605210016.RAA05977@yon.engr.sgi.com>
References: <199605210016.RAA05977@yon.engr.sgi.com>
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

Ariel Faigon writes:
...
 > Bill: do you have 200MHz CPUs?
...

     I don't have the two modules I ordered, because we have a 
large backlog.  I will probably supply something reasonable
for the host machine and a 150 MHz R5000 to put on the 
target machine, until the modules I ordered show up.

From owner-linux@cthulhu.engr.sgi.com  Mon May 20 23:25:29 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id XAA19680; Mon, 20 May 1996 23:25:29 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id GAA17803 for linux-list; Tue, 21 May 1996 06:25:25 GMT
Received: from yon.engr.sgi.com (yon.engr.sgi.com [150.166.61.32]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id XAA17798 for <linux@cthulhu.engr.sgi.com>; Mon, 20 May 1996 23:25:24 -0700
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by yon.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id XAA06414 for <linux@yon.engr.sgi.com>; Mon, 20 May 1996 23:24:59 -0700
Received: from sgi.sgi.com (sgi.engr.sgi.com [150.166.76.30]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id XAA17794 for <linux@yon.engr.sgi.com>; Mon, 20 May 1996 23:25:22 -0700
Received: from caipfs.rutgers.edu by sgi.sgi.com via ESMTP (950405.SGI.8.6.12/910110.SGI)
	for <linux@yon.engr.sgi.com> id XAA03232; Mon, 20 May 1996 23:25:21 -0700
Received: from huahaga.rutgers.edu (huahaga.rutgers.edu [128.6.155.53]) by caipfs.rutgers.edu (8.6.9+bestmx+oldruq+newsunq+grosshack/8.6.9) with ESMTP id CAA28911 for <linux@yon.engr.sgi.com>; Tue, 21 May 1996 02:25:20 -0400
Received: (davem@localhost) by huahaga.rutgers.edu (8.6.9+bestmx+oldruq+newsunq+grosshack/8.6.9) id CAA07005; Tue, 21 May 1996 02:25:20 -0400
Date: Tue, 21 May 1996 02:25:20 -0400
Message-Id: <199605210625.CAA07005@huahaga.rutgers.edu>
From: "David S. Miller" <davem@caip.rutgers.edu>
To: linux@yon.engr.sgi.com
Subject: some projections...
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk


Iris SCSI driver: Mildly amusing, once I get the dma stuff straight
		  should be a 3 or 4 day affair to get going in an
		  initial working state.

Iris Ethernet driver: Cake walk, 3 nights tops.

It seems the HPC drives both of these devices in a similar manner, it
also seems that it has little state machines which do things like
retransmit collision packets on the SEEQ and various SCSI sequences as
well.  I need to get it straight in my head where the software driver
actually comes into play.  I like the HPC dma architecture btw.

Console driver: I assume the keyboard/mouse is driven off the Zilog
		uarts, if so this should be relatively simple as I can
		adapt most of the code from my Sparc stuff and how I
		laid that code out.  As for the screen I just need
		to figure out where the frame buffer lives and how to
		play with the palette registers and I'm set.  Also
		should be cakewalk to do serial console as well. This
		might be a 4 or 5 day job depending upon how things
		go initially.

So in less than two weeks I could have drivers for all the major
devices going already.

A large section of my work will be carefully going over the existing
code the linux-mips people have and putting together the initial
foundation so that I can at least start compiling kernels, then things
can run smoothly.

Yawn, stretch, more to come...

Later,
David S. Miller
davem@caip.rutgers.edu

From owner-linux@cthulhu.engr.sgi.com  Mon May 20 23:49:58 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id XAA19712; Mon, 20 May 1996 23:49:58 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id GAA18798 for linux-list; Tue, 21 May 1996 06:49:54 GMT
Received: from yon.engr.sgi.com (yon.engr.sgi.com [150.166.61.32]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id XAA18793 for <linux@cthulhu.engr.sgi.com>; Mon, 20 May 1996 23:49:53 -0700
Received: from ares.esd.sgi.com (fddi-ares.engr.sgi.com [192.26.80.60]) by yon.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id XAA06436 for <linux@yon.engr.sgi.com>; Mon, 20 May 1996 23:49:28 -0700
Received: from fir.esd.sgi.com by ares.esd.sgi.com via ESMTP (951211.SGI.8.6.12.PATCH1042/950213.SGI.AUTOCF)
	 id XAA20269; Mon, 20 May 1996 23:49:51 -0700
Received: by fir.esd.sgi.com (940816.SGI.8.6.9/920502.SGI.AUTO)
	 id XAA14132; Mon, 20 May 1996 23:49:51 -0700
Date: Mon, 20 May 1996 23:49:51 -0700
From: wje@fir.esd.sgi.com (William J. Earl)
Message-Id: <199605210649.XAA14132@fir.esd.sgi.com>
To: "David S. Miller" <davem@caip.rutgers.edu>
Cc: linux@yon.engr.sgi.com
Subject: Re: some projections...
In-Reply-To: <199605210625.CAA07005@huahaga.rutgers.edu>
References: <199605210625.CAA07005@huahaga.rutgers.edu>
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

David S. Miller writes:
...
 > Console driver: I assume the keyboard/mouse is driven off the Zilog
 > 		uarts, if so this should be relatively simple as I can
 > 		adapt most of the code from my Sparc stuff and how I
 > 		laid that code out.  As for the screen I just need
 > 		to figure out where the frame buffer lives and how to
 > 		play with the palette registers and I'm set.  Also
 > 		should be cakewalk to do serial console as well. This
 > 		might be a 4 or 5 day job depending upon how things
 > 		go initially.
...

    The keyboard and mouse, on the Indy and Indigo2, are driven by
a PS2-style keyboard and mouse controller.  Aside from the different 
way of addressing the registers, the driver should pretty much be the
same as the Linux PS2 keyboard and mouse controller.  

    The frame buffer is not directly addressable.  Most operations
are performed via commands written to the pixel pipeline, although
one can DMA pixels from main memory to the frame buffer.  I will
arrange for you to talk with the people who do IRIX X and GL
for Newport.

    It will probably be a good idea to boot up on a serial console
first, since the graphics interface is a bit more complex than a dumb
frame buffer.  That is how we usually port IRIX to a new system.
Since there are two serial ports, one can run the console on the first
port and the remote debugger on the second port, with very little
working besides the serial driver and basic exception handling
(assuming you start by booting using bootp() and the standard PROM).


From owner-linux@cthulhu.engr.sgi.com  Mon May 20 23:53:05 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id XAA19717; Mon, 20 May 1996 23:53:05 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id GAA18928 for linux-list; Tue, 21 May 1996 06:53:01 GMT
Received: from yon.engr.sgi.com (yon.engr.sgi.com [150.166.61.32]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id XAA18923 for <linux@cthulhu.engr.sgi.com>; Mon, 20 May 1996 23:53:00 -0700
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by yon.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id XAA06446 for <linux@yon.engr.sgi.com>; Mon, 20 May 1996 23:52:36 -0700
Received: from sgi.sgi.com (sgi.engr.sgi.com [150.166.76.30]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id XAA18918 for <linux@yon.engr.sgi.com>; Mon, 20 May 1996 23:52:58 -0700
Received: from caipfs.rutgers.edu by sgi.sgi.com via ESMTP (950405.SGI.8.6.12/910110.SGI)
	 id XAA04842; Mon, 20 May 1996 23:52:56 -0700
Received: from huahaga.rutgers.edu (huahaga.rutgers.edu [128.6.155.53]) by caipfs.rutgers.edu (8.6.9+bestmx+oldruq+newsunq+grosshack/8.6.9) with ESMTP id CAA29861; Tue, 21 May 1996 02:52:55 -0400
Received: (davem@localhost) by huahaga.rutgers.edu (8.6.9+bestmx+oldruq+newsunq+grosshack/8.6.9) id CAA07073; Tue, 21 May 1996 02:52:55 -0400
Date: Tue, 21 May 1996 02:52:55 -0400
Message-Id: <199605210652.CAA07073@huahaga.rutgers.edu>
From: "David S. Miller" <davem@caip.rutgers.edu>
To: wje@fir.esd.sgi.com
CC: linux@yon.engr.sgi.com
In-reply-to: <199605210649.XAA14132@fir.esd.sgi.com> (wje@fir.esd.sgi.com)
Subject: Re: some projections...
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

   Date: Mon, 20 May 1996 23:49:51 -0700
   From: wje@fir.esd.sgi.com (William J. Earl)

       The keyboard and mouse, on the Indy and Indigo2, are driven by
   a PS2-style keyboard and mouse controller.

Good, thanks.

       It will probably be a good idea to boot up on a serial console
   first, since the graphics interface is a bit more complex than a dumb
   frame buffer.  That is how we usually port IRIX to a new system.

Good idea, this will be the direction I go in then.  Although, I can't
wait to have Linux virtual consoles available on an SGI workstation ;-)

Later,
David S. Miller
davem@caip.rutgers.edu

From owner-linux@cthulhu.engr.sgi.com  Tue May 21 08:20:37 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id IAA21301; Tue, 21 May 1996 08:20:37 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id PAA17434 for linux-list; Tue, 21 May 1996 15:20:32 GMT
Received: from piecomputer.corp.sgi.com (piecomputer.corp.sgi.com [192.102.145.157]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id IAA17426; Tue, 21 May 1996 08:20:31 -0700
Received: by piecomputer.corp.sgi.com (950413.SGI.8.6.12/930416.SGI)
	 id IAA15111; Tue, 21 May 1996 08:20:30 -0700
From: "Bob Mende Pie" <mende@piecomputer.corp.sgi.com>
Message-Id: <9605210820.ZM15109@piecomputer.corp.sgi.com>
Date: Tue, 21 May 1996 08:20:30 -0700
In-Reply-To: ariel@yon.engr.sgi.com (Ariel Faigon)
        "Meeting wednesday" (May 20,  5:16pm)
References: <199605210016.RAA05977@yon.engr.sgi.com>
X-URL: http://reality.sgi.com/employees/mende/
X-Mailer: Z-Mail-SGI (3.2S.2 10apr95 MediaMail)
To: ariel@cthulhu.engr.sgi.com, linux@cthulhu.engr.sgi.com
Subject: Re: Meeting wednesday
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

Ariel,
    What time did you schedule the meeting for???


-- 
				      /Bob...			 mende@sgi.com
				http://reality.sgi.com/mende/

From owner-linux@cthulhu.engr.sgi.com  Tue May 21 09:50:13 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id JAA03256; Tue, 21 May 1996 09:50:13 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id QAA00843 for linux-list; Tue, 21 May 1996 16:50:09 GMT
Received: from yon.engr.sgi.com (yon.engr.sgi.com [150.166.61.32]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id JAA00838 for <linux@cthulhu.engr.sgi.com>; Tue, 21 May 1996 09:50:08 -0700
Received: (from ariel@localhost) by yon.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id JAA06906 for linux; Tue, 21 May 1996 09:49:44 -0700
From: ariel@yon.engr.sgi.com (Ariel Faigon)
Message-Id: <199605211649.JAA06906@yon.engr.sgi.com>
Subject: Wednesday meeting
To: linux@yon.engr.sgi.com
Date: Tue, 21 May 1996 09:49:44 -0700 (PDT)
Reply-To: ariel@cthulhu.engr.sgi.com
Organization: Silicon Graphics Inc.
X-Mailer: ELM [version 2.4 PL24 ME5a]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length:        173
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

Oops sorry, I left the hour out:

	Linux meeting
	Matisse Room 9 upper by the kitchen area
	Wednesday (tomorrow)  1pm to 2pm

	(David is arriving Friday.)

-- 
Peace, Ariel

From owner-linux@cthulhu.engr.sgi.com  Wed May 29 15:00:29 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id PAA01769; Wed, 29 May 1996 15:00:29 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id WAA05014 for linux-list; Wed, 29 May 1996 22:00:22 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id PAA05005 for <linux@cthulhu.engr.sgi.com>; Wed, 29 May 1996 15:00:21 -0700
Received: from lanta.engr.sgi.com (lanta.engr.sgi.com [192.26.75.15]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id PAA01749 for <lmlinux@neteng.engr.sgi.com>; Wed, 29 May 1996 15:00:21 -0700
Received: by lanta.engr.sgi.com (940816.SGI.8.6.9/911001.SGI)
	 id OAA09070; Wed, 29 May 1996 14:59:58 -0700
Date: Wed, 29 May 1996 14:59:58 -0700
From: nn@lanta.engr.sgi.com (Neal Nuckolls)
Message-Id: <199605292159.OAA09070@lanta.engr.sgi.com>
To: torvalds@cs.helsinki.fi, alan@cymru.net
Subject: linux needs bsd networking stack
Cc: sparclinux-cvs@caipfs.rutgers.edu, lmlinux@neteng.engr.sgi.com
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk


Silicon Valley is bubbling with networking startups.
Many of these new small companies are designing products
based on PC motherboards and doing some sw and/or hw
customization to turn them into networking switches,
routers, firewalls, etc.  Rather than embedding a RTOS,
they are choosing a free unix and usually this is FreeBSD
since Linux networking is not the de facto BSD stack.
The "unique" tcp/ip implementation is a liability to linux.
Is anyone working to replace the standard linux stack
with a port derived from the 4.4BSD code?

thanks.

neal nuckolls
nn@engr.sgi.com

From owner-linux@cthulhu.engr.sgi.com  Wed May 29 15:20:55 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id PAA02708; Wed, 29 May 1996 15:20:54 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id WAA09491 for linux-list; Wed, 29 May 1996 22:20:50 GMT
Received: from yon.engr.sgi.com (yon.engr.sgi.com [150.166.61.32]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id PAA09469 for <linux@cthulhu.engr.sgi.com>; Wed, 29 May 1996 15:20:48 -0700
Received: (from ariel@localhost) by yon.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id PAA20059 for linux; Wed, 29 May 1996 15:20:16 -0700
From: ariel@yon.engr.sgi.com (Ariel Faigon)
Message-Id: <199605292220.PAA20059@yon.engr.sgi.com>
Subject: Come say hello to David
To: linux@yon.engr.sgi.com
Date: Wed, 29 May 1996 15:20:15 -0700 (PDT)
Reply-To: ariel@cthulhu.engr.sgi.com
Organization: Silicon Graphics Inc.
X-Mailer: ELM [version 2.4 PL24 ME5a]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length:        261
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

For all those who are on the list, haven't met Dave yet,
and are in Mountain View.

Dave is in Building 9-Upper, North-Western corner,
across from Kim Vogt, beside the Schimmels.

Please drop by, introduce yourself, see how you can help, etc.
-- 
Peace, Ariel

From owner-linux@cthulhu.engr.sgi.com  Wed May 29 15:50:31 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id PAA03273; Wed, 29 May 1996 15:50:30 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id WAA13722 for linux-list; Wed, 29 May 1996 22:50:25 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id PAA13701 for <linux@cthulhu.engr.sgi.com>; Wed, 29 May 1996 15:50:24 -0700
Received: (from dm@localhost) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id PAA03268; Wed, 29 May 1996 15:50:23 -0700
Date: Wed, 29 May 1996 15:50:23 -0700
Message-Id: <199605292250.PAA03268@neteng.engr.sgi.com>
From: "David S. Miller" <dm@neteng.engr.sgi.com>
To: nn@lanta.engr.sgi.com
CC: torvalds@cs.helsinki.fi, alan@cymru.net, sparclinux-cvs@caipfs.rutgers.edu,
        lmlinux@neteng.engr.sgi.com
In-reply-to: <199605292159.OAA09070@lanta.engr.sgi.com> (nn@lanta)
Subject: Re: linux needs bsd networking stack
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

   Date: Wed, 29 May 1996 14:59:58 -0700
   From: nn@lanta (Neal Nuckolls)

   The "unique" tcp/ip implementation is a liability to linux.

It could also be one of its greatest assets, and I think this will
turn out to be the case.

   Is anyone working to replace the standard linux stack
   with a port derived from the 4.4BSD code?

I will for now only briefly mention why I think this is not very
desirable.

A couple of weeks ago, Larry was babbling to me "oh the stack is
sloowww, I can't push nearly as much over 100mb/s ether as freebsd
can, etc."  I said, "that's peculiar" so I did some investigation and
told Linus about it.  Turned out to be a driver bug, and after that was
fixed the over-the-wire numbers are unparalleled.

The Berkeley stack is dead, but it has one redeeming quality which
Linux's stack does desperately need.

It has a well-defined architecture.  I will agree with lm when he
mentions that the Linux code is a jungle to sift through in certain
respects.  It needs a mallet to smooth certain aspects and interfaces
out.

In the end I think it is best to work on hacking the existing (and
upcoming) Linux networking code to have these qualities instead of
stuffing the bsd stack into linux (this has been done before a long
long time ago btw, before linux had any networking, a man by the name
of Charles Hedrick back at Rutgers did it in a few nights).

I think the feeling that the linux stack is "hard to follow" or "has
very little architecture" has a lot to do with the fact that we don't
have 20 books analyzing the code c-statement by c-statement like the
bsd stuff does.  If we had that, I think this desire to use the
berkeley stack would not be as strong.

I dislike the berkeley stack, but I am biased in my opinion.  I am
biased because of the attitude expressed by the people actively
working on that code set in the free software world these days, I am
also biased because I tend to hack Linux almost exclusively.  But, even
barring that I believe that some of the elements of the bsd stack will
end up being completely flawed when plugged into linux, obvious things
like mbufs and other things come to mind right now.  It would require
a bit of engineering and greatly upset a large community who has put
their entire heart and soul into the Linux networking code.  I believe
at the very least that the Linux networking stack is superior
performance wise without any question, and as everyone knows I have
numbers to prove it ;-)

Later,
David S. Miller
dm@sgi.com


From owner-linux@cthulhu.engr.sgi.com  Wed May 29 16:04:45 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id QAA06150; Wed, 29 May 1996 16:04:45 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id XAA16762 for linux-list; Wed, 29 May 1996 23:04:40 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id QAA16744 for <linux@cthulhu.engr.sgi.com>; Wed, 29 May 1996 16:04:38 -0700
Received: from localhost (lm@localhost) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via SMTP id QAA06115; Wed, 29 May 1996 16:04:37 -0700
Message-Id: <199605292304.QAA06115@neteng.engr.sgi.com>
To: dm@neteng.engr.sgi.com
From: lm@gate1-neteng.engr.sgi.com (Larry McVoy)
cc: nn@lanta.engr.sgi.com, torvalds@cs.helsinki.fi, alan@cymru.net,
        sparclinux-cvs@caipfs.rutgers.edu, lmlinux@neteng.engr.sgi.com
Subject: Re: linux needs bsd networking stack 
Date: Wed, 29 May 1996 16:04:37 -0700
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

:    The "unique" tcp/ip implementation is a liability to linux.
: 
: It could also be one of its greatest assets, and I think this will
: turn out to be the case.
: 
:    Is anyone working to replace the standard linux stack
:    with a port derived from the 4.4BSD code?
: 
: A couple of weeks ago, Larry was babbling to me "oh the stack is
: sloowww, I can't push nearly as much over 100mb/s ether as freebsd
: can, etc."  I said, "thats peculiar" so I did some investigation and
: told Linus about it.  Turned out to be a driver bug and after that was
: fixed the over the wire numbers are unparalleled.

That's not quite true, I think the BSD numbers are still better, I'll
have full lmbench apples to apples runs on the same hardware at the 
end of this week.

: It [bsd] has a well defined architecture, I will agree with lm when he
: mentions that it is a jungle of code to sift through in certain
: respects.  

This is one of my complaints.  The BSD stack has a defined set of "objects"
for dealing with networking; an incomplete list:

	protocol structure for different address families
	interface structure for different media types
	socket structure that cleanly handles different protocols
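
As a rough illustration of the kind of "object" meant here, a per-protocol
operations table lets generic socket code call into any protocol without
knowing which one it is. This is only a sketch: every name below is
hypothetical and is not taken from the BSD or Linux sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; not the real BSD or Linux structures. */

struct sock_sketch;

/* Per-protocol operations: one table per protocol, shared by its sockets. */
struct proto_ops_sketch {
    const char *name;
    int (*send)(struct sock_sketch *sk, const void *buf, size_t len);
};

/* The generic socket only knows the ops table, never the protocol itself. */
struct sock_sketch {
    const struct proto_ops_sketch *ops;
    size_t bytes_sent;
};

static int stream_send(struct sock_sketch *sk, const void *buf, size_t len)
{
    (void)buf;
    sk->bytes_sent += len;          /* a stream accepts any length */
    return (int)len;
}

static int dgram_send(struct sock_sketch *sk, const void *buf, size_t len)
{
    (void)buf;
    if (len > 1472)                 /* arbitrary per-datagram limit for the sketch */
        return -1;
    sk->bytes_sent += len;
    return (int)len;
}

static const struct proto_ops_sketch stream_ops = { "stream", stream_send };
static const struct proto_ops_sketch dgram_ops  = { "dgram",  dgram_send  };

/* Protocol-independent entry point: dispatch through the table. */
int sock_send(struct sock_sketch *sk, const void *buf, size_t len)
{
    assert(sk != NULL && sk->ops != NULL);
    return sk->ops->send(sk, buf, len);
}
```

A socket is bound to a protocol by pointing its ops field at stream_ops or
dgram_ops; callers of sock_send() never need to know which one was chosen.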

Another big plus of the BSD stack is tcp_input.c and tcp_output.c.  These
are what most people mean when they say "BSD networking".

Downsides of BSD:

	. I don't particularly like mbufs; I agree with Linus & Alan that
	they are overkill.  

	. There are layering "invariants" that affect performance: you really
	should allocate your send buffers from the interface driver, because
	it could do some interesting things that would minimize cache flushing.
	I think Van's prototype did this for witless cards.

	. Single processor design.  This is the biggest drawback, IMO.

Proposal/suggestion:

	. Come up with a strawman proposal for the set of "objects" we think
	  we need in Linux.  Do this as part of the work Linus suggested to
	  merge the socket ops with the vfs ops.

	. Steal the TCP code outright.  Nuke the mbuf stuff, use the skbufs
	  or a slightly modified version thereof.

	. Design in SMP support from the start.  This means thinking about
	  thousands of connections running in parallel.

: I think the feeling that the linux stack is "hard to follow" or "has
: very little architecture" has a lot to do with the fact that we don't
: have 20 books analyzing the code c-statement by c-statement like the
: bsd stuff does.  If we had that, I think this desire to use the
: berkeley stack would not be as strong.

Yeah, but a very reasonable point is "we don't have that".  BSD does.
This is a big deal.  Documentation is useful.

From owner-linux@cthulhu.engr.sgi.com  Wed May 29 17:36:27 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id RAA16394; Wed, 29 May 1996 17:36:27 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id AAA03431 for linux-list; Thu, 30 May 1996 00:36:21 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id RAA03420 for <linux@cthulhu.engr.sgi.com>; Wed, 29 May 1996 17:36:20 -0700
Received: from lanta.engr.sgi.com (lanta.engr.sgi.com [192.26.75.15]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id RAA16390; Wed, 29 May 1996 17:36:20 -0700
Received: by lanta.engr.sgi.com (940816.SGI.8.6.9/911001.SGI)
	 id RAA09665; Wed, 29 May 1996 17:36:19 -0700
Date: Wed, 29 May 1996 17:36:19 -0700
From: nn@lanta.engr.sgi.com (Neal Nuckolls)
Message-Id: <199605300036.RAA09665@lanta.engr.sgi.com>
To: dm@neteng.engr.sgi.com
Subject: Re: linux needs bsd networking stack
Cc: lmlinux@neteng.engr.sgi.com, sparclinux-cvs@caipfs.rutgers.edu,
        alan@cymru.net, torvalds@cs.helsinki.fi
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk


>> The "unique" tcp/ip implementation is a liability to linux.
>
> It could also be one of its greatest assets, and I think this will
> turn out to be the case.

Whether the linux kernel networking implementation is better or
worse than the BSD code isn't my point. The fact that it's not
clearly superior, only very different, from the standard is.

Most unix and internet R&D community protocol development has
been and continues to be within a BSD environment which means that
BSD-based kernel networking code is prevalent. If I'm doing some
work in this area I can readily grab many free BSD-based protocol
pieceparts off the net. New routing protocols, ATM signalling, TCP
congestion improvements, realtime protocol stacks, etc. are all
developed in a BSD kernel networking environment.  Have been for
years. That's not likely to change. There are hundreds of people
out there that really know BSD networking. This availability of
people and code makes it the standard.

Actually, for the startups that I mentioned - those interested in
shipping a commercial product - there is no choice, it's FreeBSD,
because it comes without the GPL kiss of death.

> In the end I think it is best to work on hacking the existing (and
> upcoming) Linux networking code to have these qualities instead of
> stuffing the bsd stack into linux

I think that basing any improvements on a 4.4BSD-based linux stack
would result in something more usable.  Also, as a side effect, it
would encourage more talented networking people to participate and
isn't this what freeware is all about?

neal

From owner-linux@cthulhu.engr.sgi.com  Wed May 29 20:02:52 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id UAA07685; Wed, 29 May 1996 20:02:52 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id DAA16701 for linux-list; Thu, 30 May 1996 03:02:47 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id UAA16690 for <linux@cthulhu.engr.sgi.com>; Wed, 29 May 1996 20:02:46 -0700
Received: (from dm@localhost) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id UAA07679; Wed, 29 May 1996 20:02:45 -0700
Date: Wed, 29 May 1996 20:02:45 -0700
Message-Id: <199605300302.UAA07679@neteng.engr.sgi.com>
From: "David S. Miller" <dm@neteng.engr.sgi.com>
To: nn@lanta.engr.sgi.com
CC: lmlinux@neteng.engr.sgi.com, sparclinux-cvs@caipfs.rutgers.edu,
        alan@cymru.net, torvalds@cs.helsinki.fi
In-reply-to: <199605300036.RAA09665@lanta.engr.sgi.com> (nn@lanta)
Subject: Re: linux needs bsd networking stack
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

   Date: Wed, 29 May 1996 17:36:19 -0700
   From: nn@lanta (Neal Nuckolls)

   >> The "unique" tcp/ip implementation is a liability to linux.
   >
   > It could also be one of its greatest assets, and I think this will
   > turn out to be the case.

   Most unix and internet R&D community protocol development has
   been and continues to be within a BSD environment which means that
   BSD-based kernel networking code is prevalent. If I'm doing some
   work in this area I can readily grab many free BSD-based protocol
   pieceparts off the net. New routing protocols, ATM signalling, TCP
   congestion improvements, realtime protocol stacks, etc. are all
   developed in a BSD kernel networking environment.  Have been for
   years. That's not likely to change. There are hundreds of people
   out there that really know BSD networking. This availability of
   people and code makes it the standard.

Actually, what's funny is that this is exactly what the people doing
work on the linux stack have in fact done.  Check out the ideas people
have put into various berkeley based ventures and research, place our
own implementation of the idea into the linux stack, and then improve
upon it.  The idea behind publicly available standards (at least I
believe) was to keep any single entity from having this "de facto" thing
which controlled the protocol or what have you; the berkeley
phenomenon is not encouraging this idea and is in fact against it.
Here the berkeley interpretation and implementation is being
considered "the" interpretation and "the" implementation.  The Linux
code in general moves at an advanced rate; it is constantly evolving
and changing for the purposes of improvement.  One could argue that
this, and not just the fact that its objects and interfaces are not
as well defined, is what makes it so hard to pin down and become very
acquainted with.  (On that point I wholeheartedly agree with lm.  I
would some day like to see the Linux networking be as intuitive and
seamless as the vm/mm Linux layers; Linus has mentioned moves in this
direction via integrating the sockets more completely into the
inode, among other things.)

   Actually, for the startups that I mentioned - those interested in
   shipping a commercial product - there is no choice, it's FreeBSD,
   because it comes without the GPL kiss of death.

I wish it wouldn't come down to a discussion about "what license is
better for who and why", I'd rather this be about technical merit.
But, I am beginning to realize that this may not be possible.  I want
Linux to thrive and always be on the bleeding edge, as it has been.
Personally I believe that the "GPL kiss of death" is what makes that
possible and will guarantee that this capability cannot be taken
away.

   I think that basing any improvements on a 4.4BSD-based linux stack
   would result in something more usable.  Also, as a side effect, it
   would encourage more talented networking people to participate and
   isn't this what freeware is all about?

You are correct, this is what "free software" (there is a distinction)
is all about.  However what it is not about, and what could kill free
software is if the "BSD kiss of death" allowed someone to make
significant improvements to the Linux code and due to the berkeley
license allowed that entity to keep those improvements to themselves
so that the free software community loses out.  This is precisely what
the GPL tries to avoid...  Very frequently, Linus himself mentions
that the GPL is what has made Linux as powerful as it now is, and he
also frequently mentions that GPL'ing the Linux code is the best
decision he has ever made.  I tend to agree with him.

But like I said, I'd rather such a decision be made based upon
technical merit, not a "who has the viable licensing terms"
decision.

It was mentioned that you are more familiar with the berkeley code and
many people are.  I am more familiar with the Linux code and the
work/research which has gone into it, do I have an argument as well?
I would not present it this way or drive an argument in that fashion I
think.  There exists a significant group of people who are in my
position and as well there are many in the position you speak of.
However, I have been making a significant effort over the past year or
so to become familiar with the berkeley code so that I possess the
ability to tackle decisions like this based on technical merit, and
not turn it into a GPL vs. BSD discussion.  Are people supportive of
the berkeley side of this argument willing to do the same?  I know a
few such as lm etc., I would not say they exist within the majority
though.

Later,
David S. Miller
dm@sgi.com

From owner-linux@cthulhu.engr.sgi.com  Wed May 29 22:22:26 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id WAA19895; Wed, 29 May 1996 22:22:26 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id FAA27281 for linux-list; Thu, 30 May 1996 05:22:20 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id WAA27276 for <linux@cthulhu.engr.sgi.com>; Wed, 29 May 1996 22:22:19 -0700
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id WAA19892 for <lmlinux@neteng.engr.sgi.com>; Wed, 29 May 1996 22:22:19 -0700
Received: from sgi.sgi.com (sgi.engr.sgi.com [150.166.76.30]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id WAA27269; Wed, 29 May 1996 22:22:17 -0700
Received: from porsta.cs.Helsinki.FI by sgi.sgi.com via ESMTP (950405.SGI.8.6.12/910110.SGI)
	 id WAA29720; Wed, 29 May 1996 22:22:14 -0700
Received: from linux.cs.Helsinki.FI (linux.cs.Helsinki.FI [128.214.48.39]) by porsta.cs.Helsinki.FI (8.6.10/8.6.9) with SMTP id IAA26571; Thu, 30 May 1996 08:22:12 +0300
Date: Thu, 30 May 1996 08:21:42 +0300 (EET DST)
From: Linus Torvalds <torvalds@cs.Helsinki.FI>
To: Neal Nuckolls <nn@lanta.engr.sgi.com>
cc: alan@cymru.net, sparclinux-cvs@caipfs.rutgers.edu,
        lmlinux@neteng.engr.sgi.com
Subject: Re: linux needs bsd networking stack
In-Reply-To: <199605292159.OAA09070@lanta.engr.sgi.com>
Message-ID: <Pine.LNX.3.91.960530080714.20038B-100000@linux.cs.Helsinki.FI>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk



On Wed, 29 May 1996, Neal Nuckolls wrote:
> 
> Silicon Valley is bubbling with networking startups.
> Many of these new small companies are designing products
> based on PC motherboards and doing some sw and/or hw
> customization to turn them into networking switches,
> routers, firewalls, etc.  Rather than embedding a RTOS,
> they are choosing a free unix and usually this is FreeBSD
> since Linux networking is not the de facto BSD stack.
> The "unique" tcp/ip implementation is a liability to linux.
> Is anyone working to replace the standard linux stack
> with a port derived from the 4.4BSD code?

Simple answer: it won't happen.

The _only_ advantage of the BSD stack is the de-facto standard thing, and 
quite frankly that one doesn't make much of a difference - Linux _will_ 
be the de-facto standard in one or two more years if everything goes right.
Trying to port the BSD stack would be a major mistake, imho.

I used to think the BSD stack was an option that we might want to take some
day, but I don't think so any more. My main concerns were performance and
compatibility, and we've got them both. The problems we still have in
networking are not worth worrying about in this context - we'd have a lot
more problems if we tried to switch and they wouldn't be any easier to solve. 

Note that this doesn't mean we shouldn't look at parts of the BSD stack 
for interesting things (and the BSD stack has obviously been there as a 
reference). But a whole-sale BSD stack port is not going to happen.

		Linus

From owner-linux@cthulhu.engr.sgi.com  Thu May 30 02:49:44 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id CAA26281; Thu, 30 May 1996 02:49:44 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id JAA16929 for linux-list; Thu, 30 May 1996 09:49:38 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id CAA16924 for <linux@cthulhu.engr.sgi.com>; Thu, 30 May 1996 02:49:36 -0700
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id CAA26278 for <lmlinux@neteng.engr.sgi.com>; Thu, 30 May 1996 02:49:36 -0700
Received: from sgi.sgi.com (sgi.engr.sgi.com [150.166.76.30]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id CAA16917; Thu, 30 May 1996 02:49:34 -0700
Received: from snowcrash.cymru.net by sgi.sgi.com via ESMTP (950405.SGI.8.6.12/910110.SGI)
	 id CAA16791; Thu, 30 May 1996 02:49:28 -0700
Received: (from alan@localhost) by snowcrash.cymru.net (8.7.1/8.7.1) id KAA23935; Thu, 30 May 1996 10:43:44 +0100
From: Alan Cox <alan@cymru.net>
Message-Id: <199605300943.KAA23935@snowcrash.cymru.net>
Subject: Re: linux needs bsd networking stack
To: nn@lanta.engr.sgi.com (Neal Nuckolls)
Date: Thu, 30 May 1996 10:43:42 +0100 (BST)
Cc: torvalds@cs.helsinki.fi, alan@cymru.net, sparclinux-cvs@caipfs.rutgers.edu,
        lmlinux@neteng.engr.sgi.com
In-Reply-To: <199605292159.OAA09070@lanta.engr.sgi.com> from "Neal Nuckolls" at May 29, 96 02:59:58 pm
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

> customization to turn them into networking switches,
> routers, firewalls, etc.  Rather than embedding a RTOS,
> they are choosing a free unix and usually this is FreeBSD
> since Linux networking is not the de facto BSD stack.

So we should use a de facto BSD stack because it's a de facto stack. OK, there is this
great OS called windows3. See you later.

> The "unique" tcp/ip implementation is a liability to linux.

I'm not convinced it is. A whole load of SGI people (LM notably) seem intent on "BSD
stack, BSD stack, BSD stack". Everyone else I hear is saying "How fast can it go",
"How stable can we make it", "Will you please make sure it's as solid in 2.0 as in 1.2".

> Is anyone working to replace the standard linux stack
> with a port derived from the 4.4BSD code?

No -

o	The BSD stack doesn't do IPX, AX25, NetROM, Appletalk
o	There will be no de facto IPv6 for BSD; there are several species
o	The licensing doesn't permit the two to meet easily
o	You can't do 400Mbits/second with mbufs, so you'd have to break the BSD code
	anyway

I'm not convinced about the rest of the argument either. I know one big vendor using
the BSD stack for a project. I know several using Linux (Things like the firewall
from Mazama). We are seeing primary rate ISDN support for Linux starting to appear,
and already have the heavy provider end multiple serial cards.

For routers, anyone using a PC style architecture is limiting themselves to small
routers anyway. No matter how good the code is you will soon need fancy hardware to
handle BGP4, 50,000 routes and fast 100baseT speed switching. And there is no
de facto BSD IPv6.

Alan


From owner-linux@cthulhu.engr.sgi.com  Thu May 30 02:50:03 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id CAA26290; Thu, 30 May 1996 02:50:03 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id JAA17002 for linux-list; Thu, 30 May 1996 09:49:59 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id CAA16997 for <linux@cthulhu.engr.sgi.com>; Thu, 30 May 1996 02:49:58 -0700
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id CAA26286; Thu, 30 May 1996 02:49:57 -0700
Received: from sgi.sgi.com (sgi.engr.sgi.com [150.166.76.30]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id CAA16992; Thu, 30 May 1996 02:49:56 -0700
Received: from snowcrash.cymru.net by sgi.sgi.com via ESMTP (950405.SGI.8.6.12/910110.SGI)
	 id CAA16808; Thu, 30 May 1996 02:49:53 -0700
Received: (from alan@localhost) by snowcrash.cymru.net (8.7.1/8.7.1) id KAA24198; Thu, 30 May 1996 10:46:48 +0100
From: Alan Cox <alan@cymru.net>
Message-Id: <199605300946.KAA24198@snowcrash.cymru.net>
Subject: Re: linux needs bsd networking stack
To: dm@neteng.engr.sgi.com (David S. Miller)
Date: Thu, 30 May 1996 10:46:46 +0100 (BST)
Cc: nn@lanta.engr.sgi.com, torvalds@cs.helsinki.fi, alan@cymru.net,
        sparclinux-cvs@caipfs.rutgers.edu, lmlinux@neteng.engr.sgi.com
In-Reply-To: <199605292250.PAA03268@neteng.engr.sgi.com> from "David S. Miller" at May 29, 96 03:50:23 pm
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

> It has a well defined architecture, I will agree with lm when he
> mentions that it is a jungle of code to sift through in certain
> respects.  It need a mallet to smooth certain aspects and interfaces
> out.

I hope to be working full time on this eventually (like when people are paying me
for it). I've also started documentation at the driver and skbuff level and will
work over it bit by bit - see forthcoming Linux Journal articles then going into the
KHG.

> their entire heart and soul into the Linux networking code.  I believe
> at the very least that the Linux networking stack is superior
> performance wise without any question, and as everyone knows I have
> numbers to prove it ;-)

Before you admire the performance numbers, get the pre2.1 code off Pedro Roque. Now that he
has added really neat header prediction code, it kicks pre2.0's butt even though it's
doing a surplus memory alloc we need to tidy up.

Alan


From owner-linux@cthulhu.engr.sgi.com  Thu May 30 03:26:35 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id DAA27714; Thu, 30 May 1996 03:26:35 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id KAA19576 for linux-list; Thu, 30 May 1996 10:26:29 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id DAA19571 for <linux@cthulhu.engr.sgi.com>; Thu, 30 May 1996 03:26:28 -0700
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id DAA27710; Thu, 30 May 1996 03:26:28 -0700
Received: from sgi.sgi.com (sgi.engr.sgi.com [150.166.76.30]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id DAA19562; Thu, 30 May 1996 03:26:26 -0700
Received: from snowcrash.cymru.net by sgi.sgi.com via ESMTP (950405.SGI.8.6.12/910110.SGI)
	 id DAA19088; Thu, 30 May 1996 03:26:04 -0700
Received: (from alan@localhost) by snowcrash.cymru.net (8.7.1/8.7.1) id LAA24835; Thu, 30 May 1996 11:12:14 +0100
From: Alan Cox <alan@cymru.net>
Message-Id: <199605301012.LAA24835@snowcrash.cymru.net>
Subject: Re: linux needs bsd networking stack
To: nn@lanta.engr.sgi.com (Neal Nuckolls)
Date: Thu, 30 May 1996 11:12:12 +0100 (BST)
Cc: dm@neteng.engr.sgi.com, lmlinux@neteng.engr.sgi.com,
        sparclinux-cvs@caipfs.rutgers.edu, alan@cymru.net,
        torvalds@cs.helsinki.fi
In-Reply-To: <199605300036.RAA09665@lanta.engr.sgi.com> from "Neal Nuckolls" at May 29, 96 05:36:19 pm
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

> Whether the linux kernel networking implementation is better or
> worse than the BSD code isn't my point. The fact that it's not
> clearly superior, only very different, from the standard is.

Chuckle. The BSD stack doesn't even match the RFCs [ie THE STANDARD]

> work in this area I can readily grab many free BSD-based protocol
> pieceparts off the net. New routing protocols, ATM signalling, TCP
> congestion improvements, realtime protocol stacks, etc. are all

Let me see: Routing protocols are userspace. ATM signalling we have (going in
2.1), Vegas we have in the pre 2.1 stuff. Realtime - if you mean low latency
then take a look at unet.

> Actually, for the startups that I mentioned - those interested in
> shipping a commercial product - there is no choice, it's FreeBSD,
> because it comes without the GPL kiss of death.

So "We should change it for the startups" was message 1. "The startups won't
use it" was message two. Do you have bullet holes in your shoes by any
chance?

> I think that basing any improvements on a 4.4BSD-based linux stack
> would result in something more usable.  Also, as a side effect, it
> would encourage more talented networking people to participate and
> isn't this what freeware is all about?

We can't use the 4.4BSD stack so that issue is moot. Free software is also about high
technical quality. If I wanted to get paid lots of money for hacking kernels I'd go
and work for mirkosoft


From owner-linux@cthulhu.engr.sgi.com  Thu May 30 03:27:05 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id DAA27727; Thu, 30 May 1996 03:27:04 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id KAA19629 for linux-list; Thu, 30 May 1996 10:27:00 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id DAA19624 for <linux@cthulhu.engr.sgi.com>; Thu, 30 May 1996 03:26:59 -0700
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id DAA27722; Thu, 30 May 1996 03:26:58 -0700
Received: from sgi.sgi.com (sgi.engr.sgi.com [150.166.76.30]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id DAA19616; Thu, 30 May 1996 03:26:56 -0700
Received: from snowcrash.cymru.net by sgi.sgi.com via ESMTP (950405.SGI.8.6.12/910110.SGI)
	 id DAA19138; Thu, 30 May 1996 03:26:50 -0700
Received: (from alan@localhost) by snowcrash.cymru.net (8.7.1/8.7.1) id LAA24792; Thu, 30 May 1996 11:06:27 +0100
From: Alan Cox <alan@cymru.net>
Message-Id: <199605301006.LAA24792@snowcrash.cymru.net>
Subject: Re: linux needs bsd networking stack
To: lm@gate1-neteng.engr.sgi.com (Larry McVoy)
Date: Thu, 30 May 1996 11:06:25 +0100 (BST)
Cc: dm@neteng.engr.sgi.com, nn@lanta.engr.sgi.com, torvalds@cs.helsinki.fi,
        alan@cymru.net, sparclinux-cvs@caipfs.rutgers.edu,
        lmlinux@neteng.engr.sgi.com
In-Reply-To: <199605292304.QAA06115@neteng.engr.sgi.com> from "Larry McVoy" at May 29, 96 04:04:37 pm
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

> This is one of my complaints.  The BSD stack has a defined set of "objects"
> for dealing with networking; an incomplete list:
> 
> 	protocol structure for different address families

We have these.

> 	interface structure for different media types

We have these.

> 	socket structure that cleanly handles different protocols

We have most of this.

> Another big plus of the BSD stack is tcp_input.c and tcp_output.c.  These
> are what most people mean when they say "BSD networking".

Yep. For 2.0 we have all the core stuff. For pre 2.1 we have stuff going beyond
what BSD is doing - Vegas flow control. If you want to work on stealing code and
working it in, talk with Pedro Roque, get pre 2.1 and take a good look.

> 	. There are layering "invariants" that affect performance: you really
> 	should allocate your send buffers from the interface driver, because
> 	it could do some interesting things that would minimize cache flushing.
> 	I think Van's prototype did this for witless cards.

We could certainly add allocate-via-device at very little cost. We just start
using dev->kmalloc(). Note that you can't always win on this: a packet may change
route after its buffer has been allocated.
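
A minimal user-space sketch of the allocate-via-device idea, assuming a
hypothetical per-device alloc hook. The names sketch_dev, sketch_buf,
card_alloc, buf_alloc and buf_is_native are illustrative only and are not
kernel interfaces; malloc() stands in for whatever memory the card prefers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical device descriptor; the names are illustrative, not kernel API. */
struct sketch_dev {
    const char *name;
    void *(*alloc)(size_t len);  /* optional per-device allocator, may be NULL */
};

/* Buffer that remembers which device's allocator produced its data. */
struct sketch_buf {
    const struct sketch_dev *owner;
    size_t len;
    void *data;
};

/* Stand-in for a driver allocator that would hand out card-friendly memory. */
static int card_allocs;
static void *card_alloc(size_t len)
{
    card_allocs++;
    return malloc(len);
}

/* Allocate through the device hook when present, else the generic allocator. */
static struct sketch_buf *buf_alloc(const struct sketch_dev *dev, size_t len)
{
    struct sketch_buf *b = malloc(sizeof(*b));
    if (b == NULL)
        return NULL;
    b->data = (dev != NULL && dev->alloc != NULL) ? dev->alloc(len) : malloc(len);
    if (b->data == NULL) {
        free(b);
        return NULL;
    }
    b->owner = dev;
    b->len = len;
    return b;
}

/* Nonzero if the buffer was allocated by the device it is leaving on.
 * Zero means the route changed after allocation and the win is lost. */
static int buf_is_native(const struct sketch_buf *b, const struct sketch_dev *out)
{
    assert(b != NULL);
    return b->owner == out;
}

static void buf_free(struct sketch_buf *b)
{
    if (b != NULL) {
        free(b->data);
        free(b);
    }
}
```

A sender would call buf_alloc() with the device the route currently points
at, then check buf_is_native() at transmit time and fall back to the
ordinary path if the packet ended up on a different interface.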

> 	. Single processor design.  This is the biggest drawback, IMO.

The Linux one is semi-designed for MP work. A given socket can do its own locking,
and there are only a small number of areas of overlap at the moment. Notably:

Demultiplex table needs locks.
Multiple processor writes in parallel/reads in parallel on datagram requires about
	10 lines of lock code.
We don't run the net_bh() in parallel although most of it we can do very easily.
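
As a rough illustration of the first item, here is a user-space sketch of a
demultiplex table guarded by a single spinlock. It assumes C11 atomics; the
names demux_table, sketch_sock, demux_insert and demux_lookup are made up
for the example and do not correspond to the actual kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define DEMUX_SLOTS 64

/* Hypothetical socket entry, chained per hash slot. */
struct sketch_sock {
    unsigned short port;
    struct sketch_sock *next;
};

static struct sketch_sock *demux_table[DEMUX_SLOTS];
static atomic_flag demux_lock = ATOMIC_FLAG_INIT;

/* Spin until the flag was previously clear, i.e. we now own the lock. */
static void demux_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&demux_lock, memory_order_acquire))
        ;
}

static void demux_release(void)
{
    atomic_flag_clear_explicit(&demux_lock, memory_order_release);
}

/* Link a socket at the head of its slot's chain, under the table lock. */
static void demux_insert(struct sketch_sock *sk)
{
    assert(sk != NULL);
    unsigned slot = sk->port % DEMUX_SLOTS;

    demux_acquire();
    sk->next = demux_table[slot];
    demux_table[slot] = sk;
    demux_release();
}

/* Find the socket bound to a port, or NULL; the chain walk holds the lock. */
static struct sketch_sock *demux_lookup(unsigned short port)
{
    struct sketch_sock *sk;

    demux_acquire();
    for (sk = demux_table[port % DEMUX_SLOTS]; sk != NULL; sk = sk->next)
        if (sk->port == port)
            break;
    demux_release();
    return sk;
}
```

One lock for the whole table is the simplest choice and keeps the
uniprocessor path short; per-slot locks would cut contention at the cost of
more state.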

> Proposal/suggestion:
> 	. Come up with a strawman proposal for the set of "objects" we think
> 	  we need in Linux.  Do this as part of the work Linus suggested to
> 	  merge the socket ops with the vfs ops.

That certainly can't be counter-productive.

> 	. Steal the TCP code outright.  Nuke the mbuf stuff, use the skbufs
> 	  or a slightly modified version thereof.

We can't steal it outright. 4.4BSD has abominable problems as is. The FreeBSD people
have the worst of them fixed but don't have stuff like Vegas and have all the 
horrible spoofing problems caused by bad timers.

> 	. Design in SMP support from the start.  This means thinking about
> 	  thousands of connections running in parallel.

So long as it still runs very fast for the 99.99% of people with a single-CPU 486 PC.
That's the primary target by market volume.

> : have 20 books analyzing the code c-statement by c-statement like the
> : bsd stuff does.  If we had that, I think this desire to use the
> : berkeley stack would not be as strong.
> Yeah, but a very reasonable point is "we don't have that".  BSD does.
> This is a big deal.  Documentation is useful.

It's coming. In fact AW should now have released an English language version of the
1.2 kernel programming book.

Alan


From owner-linux@cthulhu.engr.sgi.com  Thu May 30 11:21:36 1996
Received: from cthulhu.engr.sgi.com (cthulhu.engr.sgi.com [192.26.80.2]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id LAA08947; Thu, 30 May 1996 11:21:35 -0700
Return-Path: <owner-linux@cthulhu.engr.sgi.com>
Received: (from daemon@localhost) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) id SAA14111 for linux-list; Thu, 30 May 1996 18:21:30 GMT
Received: from neteng.engr.sgi.com (neteng.engr.sgi.com [192.26.80.10]) by cthulhu.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id LAA14094 for <linux@cthulhu.engr.sgi.com>; Thu, 30 May 1996 11:21:27 -0700
Received: from refugee.engr.sgi.com (refugee.engr.sgi.com [150.166.61.22]) by neteng.engr.sgi.com (950413.SGI.8.6.12/960327.SGI.AUTOCF) via ESMTP id LAA08942; Thu, 30 May 1996 11:21:27 -0700
Received: from refugee.engr.sgi.com by refugee.engr.sgi.com via ESMTP (950413.SGI.8.6.12/940406.SGI.AUTO)
	 id LAA13683; Thu, 30 May 1996 11:17:21 -0700
Message-Id: <199605301817.LAA13683@refugee.engr.sgi.com>
X-Mailer: exmh version 1.6.7 5/3/96
To: Alan Cox <alan@cymru.net>
Cc: lm@gate1-neteng.engr.sgi.com (Larry McVoy), dm@neteng.engr.sgi.com,
        nn@lanta.engr.sgi.com, torvalds@cs.helsinki.fi,
        sparclinux-cvs@caipfs.rutgers.edu, lmlinux@neteng.engr.sgi.com
Subject: Re: linux needs bsd networking stack 
In-reply-to: Message from alan@cymru.net of 30 May 1996 11:06:25 BST
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Thu, 30 May 1996 11:17:21 -0700
From: Steve Alexander <sca@refugee.engr.sgi.com>
Sender: owner-linux@cthulhu.engr.sgi.com
Precedence: bulk

Alan Cox <alan@cymru.net> writes:
>We can't steal it outright. 4.4BSD has abominable problems as is. The FreeBSD
>people have the worst of them fixed but don't have stuff like Vegas and have
>all the horrible spoofing problems caused by bad timers.

I'm not sure I understand what that means, but I'm pretty sure that Vegas is
not universally agreed to be a good idea, unless I've missed a meeting.

Just to set the record straight, there are people at SGI who don't like either
the BSD or Linux stacks ;->.

The one thing that I will say about the Linux stack is that it is evolving much
more rapidly than any of the public BSD versions are (I spent some time porting
aliases forward from one version to another for a friend, and the improvements
between versions were amazing), so at some point it will probably win on
functionality.

-- Steve



