Netbooting NetBSD in Previous

Title: Netbooting NetBSD in Previous
Post by: cuby on January 08, 2023, 10:58:12 AM
I'm experimenting with alternative OSes for black hardware again, this time NetBSD.

I checked that the NetBSD network boot loader works on real hardware (it has a problem with the mounted NFS root, but is able to mount the FS). Currently, only non-Turbo slabs are supported.

When running in Previous, the NetBSD bootloader is not receiving packets. When enabling some debugging in Previous, I see that the packets are received by slirp and decoded as DHCP rfc1533_cookie and Previous prepares a reply packet, but this never seems to arrive in the emulation.

When the en_get function in the NetBSD bootloader is executed, Previous complains:

[DMA] Channel Ethernet Receive: Error! DMA not enabled!
[EN] Receiving packet: Error! Receiver overflow (DMA disabled)!

So it seems there's a problem with the DMA behavior?

The NetBSD bootloader Ethernet code can be found in this file (http://fxr.watson.org/fxr/source/arch/next68k/stand/boot/en.c?v=NETBSD5#L61,95). It did not change much between versions; none of the versions I tested works, and all show very similar behavior.

I'll try to debug this a bit further, let's see what I can find...
Title: Re: Netbooting NetBSD in Previous
Post by: andreas_g on January 09, 2023, 01:36:05 AM
The Ethernet DMA channels are a bit mysterious! I experienced a similar behavior while doing some experiments with Ethernet speed. Normally the Ethernet receive DMA channel is in chaining mode (after one buffer is full, next buffer is set up automatically from start and stop registers by hardware and an interrupt is generated; the OS then sets new values for start and stop and sets the chaining bit in the DMA CSR). If the speed is too high, the end of the chain is reached (last buffer of chain is full; DMA is automatically disabled). Normally that would not be a problem and the OS would re-enable the DMA channel, but it seems with Ethernet the OS does not re-enable the DMA channel.
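
The chaining behavior described above can be sketched as a small C model. This is only an illustration of the description in this post, not the code in Previous's dma.c: the struct, its fields and the function names (rx_channel, rx_buffer_full, os_refill, rx_packet) are hypothetical, and the model assumes a buffer counts as full once next reaches limit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one receive DMA channel (not the Previous structs). */
struct rx_channel {
    uint32_t next, limit;   /* buffer currently being filled */
    uint32_t start, stop;   /* buffer to continue with when chaining */
    bool enabled;           /* channel accepts data */
    bool chain;             /* OS has provided a follow-up buffer */
    bool irq;               /* completion interrupt pending */
};

/* Called when the current buffer is full (next has reached limit). */
void rx_buffer_full(struct rx_channel *ch)
{
    ch->irq = true;                 /* tell the OS a buffer completed */
    if (ch->chain) {
        ch->next  = ch->start;      /* continue with the prepared buffer */
        ch->limit = ch->stop;
        ch->chain = false;          /* the OS has to set the bit again */
    } else {
        ch->enabled = false;        /* end of chain: the channel stops */
    }
}

/* What the OS does in its interrupt handler to keep the chain going. */
void os_refill(struct rx_channel *ch, uint32_t start, uint32_t stop)
{
    ch->start = start;
    ch->stop  = stop;
    ch->chain = true;
    ch->irq   = false;
}

/* A packet arrives; returns false if the channel is disabled (overflow). */
bool rx_packet(struct rx_channel *ch, uint32_t len)
{
    if (!ch->enabled)
        return false;
    ch->next += len;
    if (ch->next >= ch->limit)
        rx_buffer_full(ch);
    return true;
}
```

If packets arrive faster than os_refill() is called, the second full buffer finds chain cleared, the channel is disabled, and every later packet is dropped until something re-enables it.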

The Ethernet channels have some function to repeat buffers in case of collisions or similar errors. But I have not been able to reverse-engineer the functionality or find detailed information about it. Maybe it can be reverse-engineered from real hardware. It involves the registers named saved_next, saved_limit, saved_start and saved_stop in Previous (see dma.c). I'm not sure that is the source of the problem, but it might be.

It might also be worth checking what happens before the DMA channel gets disabled.
Title: Re: Netbooting NetBSD in Previous
Post by: cuby on January 09, 2023, 02:21:27 AM
Quote from: andreas_g on January 09, 2023, 01:36:05 AM: The Ethernet DMA channels are a bit mysterious! I experienced a similar behavior while doing some experiments with Ethernet speed. Normally the Ethernet receive DMA channel is in chaining mode (after one buffer is full, next buffer is set up automatically from start and stop registers by hardware and an interrupt is generated; the OS then sets new values for start and stop and sets the chaining bit in the DMA CSR). If the speed is too high, the end of the chain is reached (last buffer of chain is full; DMA is automatically disabled). Normally that would not be a problem and the OS would re-enable the DMA channel, but it seems with Ethernet the OS does not re-enable the DMA channel.
Thanks for the hints Andreas! Yes, I can't really say I understand what the NetBSD bootloader code tries to do... but it could well be that it's a timing problem which never shows up on real hardware and the NetBSD developers got away with this for the last 17 years or so :). The curse of too little hardware diversity – but maybe this is also the reason why the bootloader doesn't work on Turbo machines (apart from some differences in the network chip if I'm not mistaken).

Quote: The Ethernet channels have some function to repeat buffers in case of collisions or similar errors. But I have not been able to reverse-engineer the functionality or find detailed information about it. Maybe it can be reverse-engineered from real hardware. It involves the registers named saved_next, saved_limit, saved_start and saved_stop in Previous (see dma.c). I'm not sure that is the source of the problem, but it might be.
Good idea - I'll investigate. I saw some strange effects with unaligned addresses in the saved_xxx pointers, but this was not consistent.

Quote: It might also be worth checking what happens before the DMA channel gets disabled.
Noted, thanks! :)
Title: Re: Netbooting NetBSD in Previous
Post by: andreas_g on January 09, 2023, 02:35:07 AM
Quote from: cuby on January 09, 2023, 02:21:27 AM: Yes, I can't really say I understand what the NetBSD bootloader code tries to do... but it could well be that it's a timing problem which never shows up on real hardware and the NetBSD developers got away with this for the last 17 years or so :).

You can play around with timings by changing ENET_IO_DELAY and ENET_IO_SHORT in ethernet.c (lines 312 and 313).
Title: Re: Netbooting NetBSD in Previous
Post by: cuby on January 09, 2023, 03:57:51 AM
Looks like it's a bug in the bootloader after all.

For non-Turbo machines, it sets rxdma->dd_saved_next to 0 in line 308 of boot/en.c (https://github.com/NetBSD/src/blob/635c4e7ee7570e1e42d915233cface78a330cd48/sys/arch/next68k/stand/boot/en.c#L308).

If I change the line to read rxdma->dd_saved_next = dma_buffers[0]; Previous does netboot! The NetBSD kernel then crashes later during startup when setting up DMA, it seems. More to debug...

I also made a quick video of the boot process (https://multicores.org/NeXT/NetBSD_boot.mov).
Title: Re: Netbooting NetBSD in Previous
Post by: andreas_g on January 09, 2023, 04:16:42 AM
Quote from: cuby on January 09, 2023, 03:57:51 AM: Looks like it's a bug in the bootloader after all.
Not necessarily. If it works on real hardware, it also needs to work with Previous. What happens if you add the following line to dma_enet_write_memory() right before TRY (dma.c, line 835)?

dma[CHANNEL_EN_RX].saved_next = dma[CHANNEL_EN_RX].next;
Title: Re: Netbooting NetBSD in Previous
Post by: cuby on January 09, 2023, 04:48:32 AM
Quote from: andreas_g on January 09, 2023, 04:16:42 AM: Not necessarily. If it works on real hardware, it also needs to work with Previous. What happens if you add the following line to dma_enet_write_memory() right before TRY (dma.c, line 835)?

dma[CHANNEL_EN_RX].saved_next = dma[CHANNEL_EN_RX].next;
That fixes the problem, thanks! Tested with the binary bootloader from NetBSD 9. So the real hardware seems to perform some automatic initialization of saved_next? That sounds a bit strange: if you check the Turbo-specific code in the bootloader, saved_next is explicitly set there (https://github.com/NetBSD/src/blob/635c4e7ee7570e1e42d915233cface78a330cd48/sys/arch/next68k/stand/boot/en.c#L313) (but I'm not sure if this code has ever been tested)...

There's still a problem when the loaded kernel binary size is large (more than ca. 3.5 MB). This results in a "CPU halted" error message from Previous, but I guess that's a different problem...

The NetBSD kernel crash after a successful boot seems to be related to accesses to mmap'ed registers at addresses 0x02004000...003 which don't seem to be emulated (no entry in ioMemTabNEXT.c). The last lines of the log are:

channel SCSI:
DMA CSR write at $02000010 val=$18 PC=$001532f4
DMA from mem to dev
DMA: unknown command!
channel SCSI:
DMA CSR write at $02000010 val=$00 PC=$001532f6
DMA from mem to dev
DMA no command
channel SCSI:
DMA Next write at $02004010 val=$deadbeef PC=$00153454
channel SCSI:
DMA Limit write at $02004014 val=$deadbeef PC=$00153458
Bus error $02004000 PC=$00153464 /Users/me/Projects/NeXT/previous-dev/previous-code/src/ioMem.c at 413
Bus error $02004001 PC=$00153464 /Users/me/Projects/NeXT/previous-dev/previous-code/src/ioMem.c at 423
Bus error $02004002 PC=$00153464 /Users/me/Projects/NeXT/previous-dev/previous-code/src/ioMem.c at 413
Bus error $02004003 PC=$00153464 /Users/me/Projects/NeXT/previous-dev/previous-code/src/ioMem.c at 423
Title: Re: Netbooting NetBSD in Previous
Post by: andreas_g on January 09, 2023, 05:01:28 AM
Can you point me to the place in the NetBSD sources where 0x02004000 is accessed? Maybe there is some comment. I suspect there is no real register, but also no bus error. You could try just populating the address with
{ 0x02004000, SIZE_LONG, DMA_Saved_Next_Read, DMA_Saved_Next_Write },
Title: Re: Netbooting NetBSD in Previous
Post by: cuby on January 09, 2023, 05:15:19 AM
Quote from: andreas_g on January 09, 2023, 05:01:28 AM: Can you point me to the place in the NetBSD sources where 0x02004000 is accessed?
I was trying to find the exact location, but somehow debug symbols don't show as expected in my disassembler. I think it's one of the nd_bsw4 calls in nextdma_setup_curr_regs (.../dev/nextdma.c), but I need to dig through the macros that are used there to find out the referenced addresses.

Here's the NetBSD source... (https://github.com/NetBSD/src/blob/635c4e7ee7570e1e42d915233cface78a330cd48/sys/arch/next68k/dev/nextdma.c#L416)

Adding the line you proposed made it crash when trying to access 0x02004004...007, then 4008 and 400c... adding entries for all of these lets the kernel proceed further.
Another case of strange register aliases as with Plan 9 earlier?

Now Previous hangs (with a spinning wheel of death) after "scsibus0: waiting 2 seconds for devices to settle"...
Title: Re: Netbooting NetBSD in Previous
Post by: andreas_g on January 09, 2023, 05:19:12 AM
Found the place: https://github.com/NetBSD/src/blob/635c4e7ee7570e1e42d915233cface78a330cd48/sys/arch/next68k/dev/nextdma.c#L409

So it should be like this:
{ 0x02004000, SIZE_LONG, DMA_Saved_Next_Read, DMA_Saved_Next_Write },
{ 0x02004004, SIZE_LONG, DMA_Saved_Limit_Read, DMA_Saved_Limit_Write },
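
Why a missing table entry ends in a bus error can be sketched with a simplified lookup. The entry shape (address, size, read handler, write handler) mirrors the lines above, but everything else here is an assumption made for illustration: the struct io_entry layout, the handler signatures and the functions io_lookup and io_write_long are hypothetical and are not the actual ioMem.c code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified table entry; the real layout may differ. */
struct io_entry {
    uint32_t addr;                  /* first address covered */
    uint32_t size;                  /* number of bytes covered */
    uint32_t (*read)(void);         /* handler for reads */
    void (*write)(uint32_t val);    /* handler for writes */
};

/* Stand-ins for the channel's saved_next and saved_limit registers. */
static uint32_t saved_next, saved_limit;

static uint32_t saved_next_read(void)      { return saved_next; }
static void saved_next_write(uint32_t v)   { saved_next = v; }
static uint32_t saved_limit_read(void)     { return saved_limit; }
static void saved_limit_write(uint32_t v)  { saved_limit = v; }

static const struct io_entry io_table[] = {
    { 0x02004000, 4, saved_next_read,  saved_next_write  },
    { 0x02004004, 4, saved_limit_read, saved_limit_write },
};

/* Returns the entry covering addr, or NULL if nothing is mapped there. */
const struct io_entry *io_lookup(uint32_t addr)
{
    for (size_t i = 0; i < sizeof io_table / sizeof io_table[0]; i++) {
        const struct io_entry *e = &io_table[i];
        if (addr >= e->addr && addr - e->addr < e->size)
            return e;
    }
    return NULL;
}

/* Long-word write; returns 0 on success, -1 for an unmapped address. */
int io_write_long(uint32_t addr, uint32_t val)
{
    const struct io_entry *e = io_lookup(addr);
    if (e == NULL)
        return -1;      /* an emulator would raise a bus error here */
    e->write(val);
    return 0;
}
```

With only these two entries, a write to 0x02004008 still falls through to the unmapped case, which matches the crash moving on to the next address once an entry is added.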
Title: Re: Netbooting NetBSD in Previous
Post by: cuby on January 09, 2023, 05:34:43 AM
Quote from: andreas_g on January 09, 2023, 05:19:12 AM: So it should be like this:
{ 0x02004000, SIZE_LONG, DMA_Saved_Next_Read, DMA_Saved_Next_Write },
{ 0x02004004, SIZE_LONG, DMA_Saved_Limit_Read, DMA_Saved_Limit_Write },
There are also writes to 0x02004008 and 0x0200400c which happen in nextdma_setup_cont_regs, I assume lines 464/465 (https://github.com/NetBSD/src/blob/635c4e7ee7570e1e42d915233cface78a330cd48/sys/arch/next68k/dev/nextdma.c#L464)?

However, adding the (I think) obvious entries for these
        { 0x02004008, SIZE_LONG, DMA_Saved_Start_Read, DMA_Saved_Start_Write },
        { 0x0200400c, SIZE_LONG, DMA_Saved_Stop_Read, DMA_Saved_Start_Write },
crashes the kernel; the debug output shows bogus values for the saved_xxx registers (see screenshot). [edit] The bogus values are set by the NetBSD code in the lines just above these, so that's probably expected...

Maybe I fail to understand something here...


[edit] Stupid me, copy-and-paste error in the 0x0200400c line, but the kernel still hangs at the scsibus0 line...


Title: Re: Netbooting NetBSD in Previous
Post by: cuby on January 09, 2023, 05:51:45 AM
Quote from: cuby on January 09, 2023, 05:34:43 AM: the kernel still hangs at the scsibus0 line...
The last log entries are
DMA Start read at $02004018 val=$deadbee0 PC=$0000fc78
channel SCSI:
DMA Stop read at $0200401c val=$deadbee0 PC=$0000fc84
channel SCSI:
DMA SStart read at $02004008 val=$deadbee0 PC=$0000fd0a
channel SCSI:
DMA SStop read at $0200400c val=$deadbee0 PC=$0000fd14
channel SCSI:
DMA CSR read at $02000010 val=$00 PC=$0000fd9e
IO write at $0211400c val=00 PC=$00014128
IO write at $0211400b val=48 PC=$00014128

No crash this time, though (but maybe busy waiting, thus the wheel of death in Previous? Just guessing...).

[edit] The code seems to hang in the call to kpause (https://github.com/NetBSD/src/blob/635c4e7ee7570e1e42d915233cface78a330cd48/sys/dev/scsipi/scsiconf.c#L253). Setting SCSI_DELAY to 0 then results in

esp0: invalid state: 7 [intr 10, phase(c 3, p 3)]
0x020x_xxxx seems to be NEXT_P_DEV_SPACE, 0x021x_xxxx NEXT_P_DEV_BMAP according to NetBSD's sys/arch/next68k/include/cpu.h?
Title: Re: Netbooting NetBSD in Previous
Post by: andreas_g on January 09, 2023, 12:33:08 PM
Please apply this patch on top of the latest revision of branch_softfloat (r1270) and test whether it fixes the problem with kpause():

diff -ru a/src/cpu/newcpu.c b/src/cpu/newcpu.c
--- a/src/cpu/newcpu.c 2023-01-05 18:15:58
+++ b/src/cpu/newcpu.c 2023-01-09 19:18:53
@@ -2414,7 +2414,6 @@
  if (regs.stopped)
  return;
  regs.stopped = 1;
- set_special(SPCFLAG_STOP);
 #ifndef WINUAE_FOR_HATARI
  if (cpu_last_stop_vpos >= 0) {
  cpu_last_stop_vpos = vpos;
@@ -2425,7 +2424,6 @@
 static void m68k_unset_stop(void)
 {
  regs.stopped = 0;
- unset_special(SPCFLAG_STOP);
 #ifndef WINUAE_FOR_HATARI
  if (cpu_last_stop_vpos >= 0) {
  cpu_stopped_lines += vpos - cpu_last_stop_vpos;
@@ -6259,11 +6257,7 @@
  run_other_MPUs();
 
  /* We can have several interrupts at the same time before the next CPU instruction */
- /* We must check for pending interrupt and call do_specialties_interrupt() only */
- /* if the cpu is not in the STOP state. Else, the int could be acknowledged now */
- /* and prevent exiting the STOP state when calling do_specialties() after. */
- /* For performance, we first test PendingInterruptCount, then regs.spcflags */
- while ( ( PendingInterrupt.time <= 0 ) && ( PendingInterrupt.pFunction ) && ( ( regs.spcflags & SPCFLAG_STOP ) == 0 ) ) {
+ while ( ( PendingInterrupt.time <= 0 ) && ( PendingInterrupt.pFunction ) ) {
  CALL_VAR(PendingInterrupt.pFunction); /* call the interrupt handler */
  }
 
@@ -6390,11 +6384,7 @@
  run_other_MPUs();
 
  /* We can have several interrupts at the same time before the next CPU instruction */
- /* We must check for pending interrupt and call do_specialties_interrupt() only */
- /* if the cpu is not in the STOP state. Else, the int could be acknowledged now */
- /* and prevent exiting the STOP state when calling do_specialties() after. */
- /* For performance, we first test PendingInterruptCount, then regs.spcflags */
- while ( ( PendingInterrupt.time <= 0 ) && ( PendingInterrupt.pFunction ) && ( ( regs.spcflags & SPCFLAG_STOP ) == 0 ) ) {
+ while ( ( PendingInterrupt.time <= 0 ) && ( PendingInterrupt.pFunction ) ) {
  CALL_VAR(PendingInterrupt.pFunction); /* call the interrupt handler */
  }
  /* Previous: for now we poll the interrupt pins with every instruction.
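
The thread does not spell out why kpause() hung, so the following is only a toy model of the general hazard that the removed SPCFLAG_STOP condition relates to: if timed events are not serviced while the CPU sits in the STOP state, the interrupt that would wake it is never raised. All names here (struct cpu, struct event, step, gate_on_stop) are hypothetical and do not correspond to newcpu.c.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified CPU and timer-event state. */
struct cpu {
    bool stopped;            /* CPU executed STOP and waits for an interrupt */
    int pending_irq_level;   /* 0 means no interrupt pending */
};

struct event {
    int time;                /* cycles until the event is due */
    int irq_level;           /* interrupt level raised by the event */
    bool armed;              /* event has not fired yet */
};

/* Fire the event if it is due; this is what raises the wake-up interrupt. */
static void run_due_events(struct cpu *c, struct event *ev)
{
    if (ev->armed && ev->time <= 0) {
        c->pending_irq_level = ev->irq_level;
        ev->armed = false;
    }
}

/*
 * Advance the model by 'cycles'. With gate_on_stop set, due events are
 * skipped while the CPU is stopped, which models the gated loop condition.
 */
void step(struct cpu *c, struct event *ev, int cycles, bool gate_on_stop)
{
    ev->time -= cycles;
    if (!(gate_on_stop && c->stopped))
        run_due_events(c, ev);
    if (c->stopped && c->pending_irq_level > 0)
        c->stopped = false;  /* a pending interrupt ends the STOP state */
}
```

In this model a stopped CPU stepped with gate_on_stop set never leaves STOP, whereas without the gate the due event raises its interrupt and execution resumes. The real emulator has other paths that service events during STOP, so this is a sketch of the failure class rather than a diagnosis.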
Title: Re: Netbooting NetBSD in Previous
Post by: cuby on January 09, 2023, 12:48:06 PM
Quote from: andreas_g on January 09, 2023, 12:33:08 PM: test if it fixes the problem with kpause()
It does, the emulation now continues (and then fails with the SCSI error I mentioned above, of course) instead of hanging.

Great, thanks again! :)

Title: Re: Netbooting NetBSD in Previous
Post by: andreas_g on January 09, 2023, 12:53:11 PM
Great! So next let's find out where in the NetBSD code this message is printed:
esp0: invalid state: 7 [intr 10, phase(c 3, p 3)]
Title: Re: Netbooting NetBSD in Previous
Post by: cuby on January 09, 2023, 01:41:04 PM
Quote from: andreas_g on January 09, 2023, 12:53:11 PM: Great! So next let's find out where in the NetBSD code this message is printed:
esp0: invalid state: 7 [intr 10, phase(c 3, p 3)]
Whoops, should've told you right away: in the ncr53c9x.c driver (https://github.com/NetBSD/src/blob/635c4e7ee7570e1e42d915233cface78a330cd48/sys/dev/ic/ncr53c9x.c#L2673). State values are defined in ncr53c9xvar.h (https://github.com/NetBSD/src/blob/635c4e7ee7570e1e42d915233cface78a330cd48/sys/dev/ic/ncr53c9xvar.h#L342).

State 7 is NCR_CMDCOMPLETE (MSG_CMDCOMPLETE received).

I'll turn on debugging now to see a bit more...
[edit] log added, only one disk connected with ID0

Previous complains about "[DMA] Channel SCSI: Starting with 4 residual bytes in DMA buffer."



Title: Re: Netbooting NetBSD in Previous
Post by: cuby on January 09, 2023, 02:45:19 PM
Update:

It seems the condition in l. 2858 of ncr53c9x.c (https://github.com/NetBSD/src/blob/635c4e7ee7570e1e42d915233cface78a330cd48/sys/dev/ic/ncr53c9x.c#L2858) is triggered – if (NCRDMA_ISINTR(sc)) – the state machine starts from scratch and is then confused about the MSG_CMDCOMPLETE state.
Title: Re: Netbooting NetBSD in Previous
Post by: andreas_g on January 10, 2023, 03:22:30 AM
The SCSI controller is probably the most annoying part to simulate. I have read its datasheet about 100 times and still it fails the diagnostic tests (diagnostics utility). I'm not sure what might be wrong here. I'll have to investigate the sequence before it fails (enable all debugging in esp.c and dma.c and see what it does). The message about residual bytes inside the DMA channel is normally non-critical and purely informational. But I'll have a look at that one too.
Title: Re: Netbooting NetBSD in Previous
Post by: ramalhais on February 24, 2023, 07:01:04 PM
I needed to make some small changes to get the standalone NetBSD bootloader to work on Turbos. Maybe this helps; sorry about the whitespace/formatting changes:

https://github.com/NetBSD/src/compare/trunk...ramalhais:netbsd-src:NeXT
Title: Re: Netbooting NetBSD in Previous
Post by: ramalhais on February 26, 2023, 06:07:06 PM
Main changes:

Add Nitro and Cube Turbo to conditions of this kind (search for NeXT_WARP9):
if (rom_machine_type == NeXT_WARP9 ||

and add the defines (IDs from testing with Previous):
#define NeXT_TURBO_MONO_40   6
#define NeXT_TURBO_COLOR_40   7
#define NeXT_CUBE_TURBO      8
#define NeXT_CUBE_TURBO_40   10
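
One way to keep such conditions readable is to collect the new IDs in a single predicate. This is only a sketch: the four values are the ones listed above, while the helper name is_added_turbo_type is hypothetical and does not appear in the linked branch, which extends the existing if conditions directly.

```c
#include <stdbool.h>

/* IDs quoted in the post above (observed when testing with Previous). */
#define NeXT_TURBO_MONO_40   6
#define NeXT_TURBO_COLOR_40  7
#define NeXT_CUBE_TURBO      8
#define NeXT_CUBE_TURBO_40   10

/* Hypothetical helper: true for the machine types the post adds. */
bool is_added_turbo_type(int rom_machine_type)
{
    switch (rom_machine_type) {
    case NeXT_TURBO_MONO_40:
    case NeXT_TURBO_COLOR_40:
    case NeXT_CUBE_TURBO:
    case NeXT_CUBE_TURBO_40:
        return true;
    default:
        return false;
    }
}
```

An existing check could then OR this predicate into its condition instead of repeating the four comparisons at every site.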


In sys/arch/next68k/include/loadfile_machdep.h, fix the ALIGNENTRY macro. This fixed SCSI/DMA:
// #define ALIGNENTRY(a)      0
#define   ALIGNENTRY(a)      ((u_long)(a))

Title: Re: Netbooting NetBSD in Previous
Post by: andreas_g on February 27, 2023, 12:51:55 AM
Great work, thank you for sharing! May I suggest you also post this to the NetBSD next68k mailing list? Mr. Tsutsui has been working on NeXT-specific changes recently. He also helped debug some things in Previous. Maybe there are some synergies and duplicate work can be avoided.
Title: Re: Netbooting NetBSD in Previous
Post by: ramalhais on March 21, 2023, 06:29:44 PM
Quote from: andreas_g on February 27, 2023, 12:51:55 AM: Great work, thank you for sharing! May I suggest you also post this to the NetBSD next68k mailing list? Mr. Tsutsui has been working on NeXT-specific changes recently. He also helped debug some things in Previous. Maybe there are some synergies and duplicate work can be avoided.

Thanks for pointing that out!
I was a bit afraid that making workarounds for NetBSD in Previous might break my Linux port, but it seems to have fixed some issues I was having.
The NetBSD bootloader now seems to work on all 040s, where before it only worked on Turbos.
Somewhere along the way, my hanging issues also got fixed.
But it's hard to keep track of all these issues.
