NeXTbus protocol and timing info

Title: NeXTbus protocol and timing info
Post by: M Paquette on June 21, 2015, 01:30:46 PM
Ed Note: Things found on an old disk one night while scrounging about...

The NeXTbus itself does indeed support a PEAK transfer rate of 100 Mbytes/second WITHIN a burst transfer.  In the real world, though, there is a substantial gap between the peak transfer rate seen within a burst transfer and the sustained transfer rate that programmed I/O operations see.  The overhead comes from the local bus cycles and NeXTbus cycles needed for a transaction, in addition to the burst data transfer itself.

The NeXTbus specification permits burst transfers of 4, 8, 16, or 32 words after arbitration and bus acquisition and before an acknowledge cycle.  The current revision of the NBIC chip supports 4 word burst transfers only.

Within a NeXTbus burst transfer, both the bus master and slave act to control the rate of transfer using the DRQ* and TM0* NeXTbus lines.  These signals can act to delay following cycles of a burst transfer.


NeXTbus Transactions

Ignoring the localbus for now, here are some best case timings for transfers.  These ignore possible arbitration, contention, and flow control issues.

The complete NeXTbus burst transfer for a 4 word burst includes a start clock cycle, 2 data transfer clock cycles in which 4 words are clocked out by DSTB*, a cycle in which the bus master stops driving AD31-0* and DSTB*, followed by a cycle in which the slave drives its status code on TM1-0* and asserts ACK*.  The master samples TM1-0* at the end of this cycle.

The transaction thus takes 5 clocks at 12.5 MHz to move 4 words, taking a total of 400 nsec.

A single word write transfer consists of a start clock cycle followed by two or more cycles in which the data is presented on the bus.  The data is presented until an acknowledge cycle is generated by the slave device.  Transferring 1 word of data thus takes at least 3 clocks at 12.5 MHz, taking 240 nsec.  Timings for read cycles are similar.
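
The cycle counts above convert to times and rates as sketched below.  This is only an illustration in C: the function names are mine, not anything from NeXT, and 1 Mbyte is taken as 10^6 bytes to match the figures in the text.

```c
#include <assert.h>

/* Duration in nanoseconds of `clocks` bus clocks at `mhz` MHz. */
static double cycles_to_nsec(unsigned clocks, double mhz)
{
    assert(mhz > 0.0);
    return clocks * 1000.0 / mhz;
}

/* Rate in Mbytes/second for `bytes` moved in `nsec` nanoseconds
   (1 Mbyte = 10^6 bytes, as in the text). */
static double mbytes_per_sec(unsigned bytes, double nsec)
{
    assert(nsec > 0.0);
    return bytes * 1000.0 / nsec;
}
```

cycles_to_nsec(5, 12.5) gives the 400 nsec burst and cycles_to_nsec(3, 12.5) the 240 nsec single word write.  The 100 Mbytes/second peak is the 16 bytes of a burst divided by its 2 data clocks alone (160 nsec).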


Localbus Transactions

Before data can move over the NeXTbus, it passes through a localbus transaction of some form to enter the NBIC chip.  After arriving at the destination board, the data has to pass through another NBIC chip and be driven onto a localbus.  There is a localbus transaction protocol that contributes to the total transfer time.

In the case of a burst write over the NeXTbus, the transaction starts with a burst write to a local slave NBIC.  Here are the gory details:

The first clock places the desired address on LAD31-0 and drives R/W* low, along with asserting the transfer size.  ASIN* latches this information into the NBIC.  In clock 2, addresses on the bus go invalid, and the local bus master asserts GBCYC*, telling the NBIC to start a NeXTbus transaction as a global master, and BREQ*, indicating a burst transfer.  In clock 3, the NBIC asserts GMASTER*, indicating that it is performing a NeXTbus transaction as a global master, and BACK*, acknowledging the burst request.  In clock 4, the localbus places the first word of data on the LAD bus.  The NBIC asserts STERM*, indicating that the data will be latched on the next falling clock edge.  In clock 5, the NBIC latches the first word of data.  In clock 6, the localbus asserts the second word of data, and the NBIC asserts STERM* again.  In clock 7, the NBIC latches the second word of data.  This is repeated for the next 2 words of data, ending at clock 12.  In clock 13, the NBIC deasserts the GMASTER* signal.

This gives us a total of 13 clocks at 25 MHz (the CPU clock is used) to move 4 words of data into the NBIC, taking 520 nsec.

The NeXTbus burst transfer ends at another NBIC, which will be performing a burst write onto the localbus as bus master.  This transaction will be clocked by the NBIC LCLK, at 16.67 MHz.

The transaction takes a total of 12 clocks.  In the first clock, the NBIC has gained bus ownership of the localbus through the local bus arbitration mechanism, but has not started a transaction.  Near the end of clock 1, the NBIC asserts BGACK*, beginning the transaction.  The NBIC also asserts FB* and GSLAVE*, indicating to the local bus that it is acting as a NeXTbus slave, places values on the LAD and SIZ1-0 lines, and drives R/W* low, to indicate a write operation.  In clock 2, the values on the LAD, R/W*, and SIZ1-0 lines become valid, and the NBIC asserts BREQ*, indicating a burst operation.  AS* is asserted near the end of the cycle to latch address and size information into the local bus slave.  In clock 3, the address goes invalid, and the first word of data appears on the bus near the end of the cycle.  The data remains asserted through clock 4.  In clock 5, the local slave device must assert BACK*, acknowledging the burst transfer, and STERM*, indicating the data will be latched on the next falling clock edge.  In clocks 6, 7, and 8, the following words of data are latched on the falling clock edge.  In clock 8, the NBIC deasserts BREQ*, indicating the end of the burst.  In clock 9, the local slave deasserts BACK*, and the NBIC deasserts AS* and DS*.  The local slave deasserts STERM*.  In clock 10, the NBIC deasserts FB*, R/W*, and the SIZ1-0 lines, and data on the LAD bus goes invalid.  In this period, the NBIC generates its ACK* and status code over the NeXTbus, as NeXTbus slave.  In clock 12, the NBIC stops driving values on all the transaction lines and deasserts BGACK* and GSLAVE*, ending the NBIC transaction cycle.

This gives us a total of 12 clocks at 16.67 MHz to move 4 words of data out of the NBIC, taking about 720 nsec.

Total Transaction Times

Our 4 word burst transfer takes 520 nsec to move from the CPU card local bus to the NBIC, 400 nsec to move from the master NBIC to the slave NBIC over the NeXTbus, and 720 nsec to move from the slave NBIC onto the destination local bus.  The total elapsed time is 1640 nsec, for a transfer rate of about 9.76 Mbytes/second using burst writes.
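
The sum can be reproduced from the clock counts given earlier.  The sketch below is an illustration only: the struct and function names are hypothetical, and the 16.67 MHz LCLK is approximated as a 60 nsec period so that 12 clocks come to the "about 720 nsec" quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* One phase of the transfer: a clock count and the clock period. */
struct phase {
    const char *name;
    unsigned clocks;
    double period_nsec;
};

/* The three phases of a 4 word burst write, as counted in the text.
   Assumption: the 16.67 MHz LCLK is rounded to a 60 nsec period. */
static const struct phase burst_write[] = {
    { "CPU localbus to master NBIC",        13, 40.0 },  /* 25 MHz    */
    { "NeXTbus, master NBIC to slave NBIC",  5, 80.0 },  /* 12.5 MHz  */
    { "slave NBIC to destination localbus", 12, 60.0 },  /* 16.67 MHz */
};

/* Sum of all phases, with no overlap between them. */
static double total_nsec(const struct phase *p, size_t n)
{
    double sum = 0.0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += p[i].clocks * p[i].period_nsec;
    return sum;
}

/* Mbytes/second (10^6 bytes) for `bytes` moved in `nsec` nanoseconds. */
static double rate_mbytes_per_sec(unsigned bytes, double nsec)
{
    assert(nsec > 0.0);
    return bytes * 1000.0 / nsec;
}
```

Summing the three phases gives 1640 nsec, and 16 bytes in 1640 nsec is about 9.76 Mbytes/second.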

If we enable store and forward transfers in the CPU board NBIC, we can interleave writes to the NBIC with the NBIC transfers over the NeXTbus.  This reduces the total elapsed time to 1120 nsec, within a sequence of burst transfers.  This gives us a transfer rate of about 14.3 Mbytes/second.
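
One way to read the 1120 nsec figure is as the 400 nsec NeXTbus phase plus the 720 nsec destination localbus phase, with each 520 nsec CPU localbus write after the first hidden behind the previous burst.  That reading is my assumption, not something the notes state, and the names below are hypothetical.  Under it, a sequence of n bursts can be modeled like this:

```c
#include <assert.h>

#define BURST_BYTES 16u      /* 4 words of 4 bytes */
#define FILL_NSEC   520.0    /* first CPU localbus write into the NBIC */
#define STEADY_NSEC 1120.0   /* assumed: NeXTbus (400) + destination localbus (720) */

/* Elapsed time for n back-to-back bursts, assuming every localbus
   write after the first is fully overlapped with the previous burst. */
static double sequence_nsec(unsigned n)
{
    assert(n > 0);
    return FILL_NSEC + n * STEADY_NSEC;
}

/* Effective rate in Mbytes/second (10^6 bytes) over the whole sequence. */
static double sequence_mbytes_per_sec(unsigned n)
{
    return (double)n * BURST_BYTES * 1000.0 / sequence_nsec(n);
}
```

With n = 1 this reproduces the unpipelined 9.76 Mbytes/second.  As n grows the rate approaches 16 bytes per 1120 nsec, about 14.3 Mbytes/second.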

Repeating the timing exercises for a single word read over the NeXTbus gives us 540 nsec for an NBIC local bus master to read 1 word, 240 nsec to move the word over the NeXTbus, and 320 nsec to read the word from the CPU board NBIC.  The total elapsed time to read 1 word is 1100 nsec, or a transfer rate of 3.6 Mbytes/second.  There is actually a good bit of overlap in the read process between the NeXTbus and CPU localbus, and the actual elapsed time is more like 780 nsec, for a transfer rate of about 5.1 Mbytes/second.


A Sample Burst Write Routine

Here is a sample burst write routine, used to pump data over the NeXTbus using programmed I/O.  When used to move data over the NeXTbus to a board capable of handling sustained burst writes, I measured a sustained write transfer rate of (surprise!) 14.3 Mbytes/second.  When performing non-burst reads from the NeXTbus board, I measured a transfer rate of 5.1 Mbytes/second.

//
//   Perform burst transfers using the 68040 move16 opcode.
//   src and dst pointers must be aligned on 16 byte boundaries,
//   and the transfer count must be given as a multiple of 16.
//   The transfer count must be under 1 Mbyte (it must fit in
//   20 bits), because the dbra instruction uses a 16 bit loop counter.
//
//   No error checking is done.  Alignment is silently enforced.
//   A count below 16 is not handled: the line count is 0, the loop
//   counter wraps, and 65536 lines (1 Mbyte) are copied.
//
   .text
_burst_copy:   .globl _burst_copy
   movl   sp@(4), d0   | Src addr
   andl   #~15, d0   | silently forced to 16 byte alignment
   movl   d0, a0

   movl   sp@(8), d0   | Dst addr
   andl   #~15, d0   | silently forced to 16 byte alignment
   movl   d0, a1

   movl   sp@(12), d0   | Byte count
   lsrl   #4, d0      | divided by 16 to a line count
   subql   #1, d0      | and pre-decremented for dbra loop

1:
//   move16   a0@+, a1@+   | Move 16 bytes (hand-assembled below)
   .long   0xf6209000   | opcode words for move16 a0@+, a1@+
   dbra   d0, 1b      | decrement loop counter, loop until it reaches -1
   
   rts
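
The routine reads its arguments from sp@(4), sp@(8), and sp@(12), so it can presumably be called from C as burst_copy(src, dst, nbytes).  Since the assembly does no error checking, a caller may want to validate the arguments first.  The sketch below is an illustration only: the prototype in the comment, the checker burst_copy_args_ok, and the portable stand-in burst_copy_ref (for testing on a host without a 68040) are my assumptions and are not part of the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Presumed prototype of the assembly routine (an assumption):
 *     extern void burst_copy(void *src, void *dst, unsigned long nbytes);
 */

#define LINE_BYTES  16u         /* one move16 line */
#define LIMIT_BYTES (1u << 20)  /* stated limit: count must be under 1 Mbyte */

/* Returns 1 if the arguments meet the restrictions listed in the
   routine's header comment, 0 otherwise. */
static int burst_copy_args_ok(const void *src, const void *dst, size_t nbytes)
{
    if (((uintptr_t)src | (uintptr_t)dst) & (LINE_BYTES - 1))
        return 0;                       /* not 16 byte aligned */
    if (nbytes == 0 || nbytes % LINE_BYTES != 0)
        return 0;                       /* not a whole number of lines */
    return nbytes < LIMIT_BYTES;
}

/* Portable stand-in that copies whole 16 byte lines, one per loop
   iteration, like the move16 loop.  It does not generate bus bursts. */
static void burst_copy_ref(const void *src, void *dst, size_t nbytes)
{
    const unsigned char *s = src;
    unsigned char *d = dst;
    size_t lines = nbytes / LINE_BYTES;

    assert(burst_copy_args_ok(src, dst, nbytes));
    while (lines--) {
        memcpy(d, s, LINE_BYTES);
        s += LINE_BYTES;
        d += LINE_BYTES;
    }
}
```

Unlike the assembly, which silently rounds misaligned pointers down, the checker rejects them, so a bad argument shows up in testing instead of corrupting neighboring memory.
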
Title: Re: NeXTbus protocol and timing info
Post by: Rob Blessin Black Hole on July 09, 2015, 02:59:37 PM
Thank you, Mike.  We always appreciate the detailed information.  I know there are a couple of garage Cube expansion board projects going on out there, and not a lot of detail about NBIC chip specifications, so this will help them a lot.  I'll point them at this post.  I still have a few NBIC chips in the original NeXT packaging.
