GTI Mirabox, Netgear RN102/RN104, Netgear RN2120 Installation & Kernel Upgrade (Linux-6.8.7) (2024)

Posted by bodhi

bodhi


March 02, 2024 03:02PM
Admin
Registered: 13 years ago
Posts: 18,712

Hi Trond,

> most likely because 'mkimage' on the 2nd box do
> not support 'MODULES=dep':

> So it seems 'mkimage' got a face lift in Debian
> Bookworm?

That's probably not the reason, because 'MODULES=dep' was already in Debian 11.x.

This is one of my home media NAS boxes, still running Debian 11.

root@HomeMedia1:~# cat /etc/debian_version
11.1
root@HomeMedia1:~# grep -i modules /etc/initramfs-tools/initramfs.conf
# MODULES: [ most | netboot | dep | list ]
# dep - Try and guess which modules to load.
# netboot - Add the base modules, network modules, but skip block devices.
# list - Only include modules from the 'additional modules' list
MODULES=most

Perhaps "dep" did not work before. I'll try a test to see if I can recreate your observation.

-bodhi
===========================
Forum Wiki
bodhi's corner (buy bodhi a beer)

bodhi


March 02, 2024 03:10PM
Admin
Registered: 13 years ago
Posts: 18,712

Hi Trond,

> Perhaps "dep" did not work before. I'll try a test
> to see if I can recreate your observation.

I've verified that "MODULES=dep" works in Debian 11.

-bodhi

bodhi


March 03, 2024 02:57PM
Admin
Registered: 13 years ago
Posts: 18,712

Here is kernel linux-6.7.5-mvebu-370xp-tld-3. The changes in this build are:

- Configured as an SMP kernel, to add support for the Netgear RN2120 (Armada XP, 2 cores).
- I2S and SPDIF are still configured in the kernel (these were removed in linux-6.7.5-mvebu-370xp-tld-2).
- The DTB for the Netgear RN102 has the audio controller nodes removed. This is for testing the time lag issue on the RN102 (to see whether it helps or hurts).

Download at Dropbox

linux-6.7.5-mvebu-370xp-tld-3-bodhi.tar.bz2

md5: a44771dd7d586d2171d9c68aeb19e96e
sha256: d9e82ba31cf2ddc123aa882bee0dd4c4c05ab5095eccc7546c6c8d33fb9f100c

This tarball contains 5 files

linux-image-6.7.5-mvebu-370xp-tld-3_3_armhf.deb
linux-headers-6.7.5-mvebu-370xp-tld-3_3_armhf.deb
zImage-6.7.5-mvebu-370xp-tld-3
config-6.7.5-mvebu-370xp-tld-3
linux-dtb-6.7.5-mvebu-370xp-tld-3.tar

Please do a full kernel installation (see instruction for kernel linux-6.5.7-mvebu-370xp-tld-1 up top).

-bodhi

tme


March 03, 2024 03:55PM
Registered: 7 years ago
Posts: 165

Hi bodhi,

Thank you very much for your efforts! I have successfully installed Linux kernel version 'linux-6.7.5-mvebu-370xp-tld-3' on a ReadyNAS RN102. It works as expected.

To check the system clock, I disabled the 'crontab' job that updates the system clock from the hardware clock once a minute. I also made the RN102 trust my laptop so that 'ssh' works without a password. On the laptop I queried the system clock twice, with a 5-minute delay in between:

$ uname -a
Linux rn102 6.7.5-mvebu-370xp-tld-3 #1 SMP PREEMPT Thu Feb 29 19:03:59 PST 2024 armv7l GNU/Linux
$ cat /etc/debian_version
12.4
$ ssh rn102 date -R && sleep 300 && ssh rn102 date -R
Sun, 03 Mar 2024 22:41:39 +0100
Sun, 03 Mar 2024 22:46:14 +0100

So the rn102's system clock lagged 25 seconds in 5 minutes. That's 5 seconds per minute.
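
For anyone repeating this measurement, the arithmetic can be scripted. The following is only a sketch: the function names lag_seconds and lag_per_minute are made up for illustration, it assumes the box's clock is read as epoch seconds with 'date +%s' instead of 'date -R', and it ignores the time the 'ssh' round trips take.

```shell
#!/bin/sh
# lag_seconds START END WALL
# START and END are the box's clock (epoch seconds) before and after the wait,
# WALL is how many seconds really passed. Prints how far the box fell behind.
lag_seconds() {
    echo $(( $3 - ($2 - $1) ))
}

# lag_per_minute LAG WALL
# Scales a lag measured over WALL seconds to seconds per minute (integer).
lag_per_minute() {
    echo $(( $1 * 60 / $2 ))
}

# Hypothetical usage against a host named rn102:
#   t0=$(ssh rn102 date +%s); sleep 300; t1=$(ssh rn102 date +%s)
#   lag=$(lag_seconds "$t0" "$t1" 300)
#   lag_per_minute "$lag" 300
```

With the two timestamps above, the box's clock advanced 275 s during the 300 s wait, which is where the 25 s lag comes from.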

Regards,
Trond Melen

bodhi


March 03, 2024 04:18PM
Admin
Registered: 13 years ago
Posts: 18,712

Hi Trond,

6.7.5-mvebu-370xp-tld-3

> So the rn102's system clock lagged 25 seconds in 5
> minutes. That's 5 seconds per minute.

6.7.5-mvebu-370xp-tld-2

> It used to be lagging 6 to 7 seconds
> per minute. Now it's lagging just above 5 seconds
> per minute.

So this shows that the kernel configs for I2S and SPDIF can remain; we just need to remove the audio controller nodes in the DTS to see the difference.

And also perhaps later you can tell if SMP kernel performance is about the same as non-SMP in some tasks. Or if the difference is insignificant. The reason I used non-SMP was from danitool's testing on OpenWrt as credited in the 1st post (there are some other tweaks for Armada 370 SoC by danitool that I have not looked into)

But SMP on a single CPU should only incur some very small overhead, IMO.

-bodhi

bodhi


March 13, 2024 06:23PM
Admin
Registered: 13 years ago
Posts: 18,712

Kernel linux-6.7.5-mvebu-370xp-tld-3 package has been uploaded. See 1st post for download link.

Note that this is the same as the working kernel I've uploaded before in this thread.

-bodhi

Edited 1 time(s). Last edit at 03/13/2024 06:24PM by bodhi.

bodhi


March 19, 2024 05:30PM
Admin
Registered: 13 years ago
Posts: 18,712

pczerepaniak & Trond,

I've tried WOL on my Mirabox and it did not work either.

Note that in /etc/init.d/halt, NETDOWN is not cleared, so the halt action would shut down the network.

halt -d -f $netdown $poweroff $hddown

So I explicitly modified this initscript to do

halt -d -f $poweroff $hddown

Even with the above change, WOL does nothing.

Translate all of the above to systemd scripts if you are running it.

-bodhi

bodhi


May 21, 2024 02:47PM
Admin
Registered: 13 years ago
Posts: 18,712

Kernel linux-6.8.7-mvebu-370xp-tld-1-bodhi.tar.bz2 package has been uploaded. Please see 1st post for download link.

Hi Trond,

I did not implement any patch to improve the timing problem on the RN102/104 in this kernel. I'm thinking we should try to modify the DTS first, so I will post the working DTB here later.

-bodhi

bodhi


May 22, 2024 01:54PM
Admin
Registered: 13 years ago
Posts: 18,712

Hi Trond,

The attached DTBs have the watchdog timer removed. The RN102 also has audio controller and SPDIF removed. Please try the timing test to see if there is any difference.

-bodhi

Attachments:
open | download - armada-370-netgear-rn102.dtb (14.8KB)
open | download - armada-370-netgear-rn104.dtb (15.2KB)

tme


May 27, 2024 11:59AM
Registered: 7 years ago
Posts: 165

Thanks bodhi,

for continuing to support the Netgear ReadyNAS! Following the procedure in the 1st post, I upgraded one of my RN102 boxes to Linux kernel version 6.8.7. The box booted fine and I could log in using 'ssh' and copy files using 'scp', so the new kernel basically works.

Examining the 'dmesg' output (attached), I discovered two debug traces, though, after this line:

[ 28.050207] BUG: scheduling while atomic: mdadm/1710/0x00000002

'mdadm' indicates an issue with 'raid', but the '/home' directory mounted OK:

$ lsblk
NAME      MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
sda         8:0    1   1.8T  0 disk
`-sda1      8:1    1   1.8T  0 part
  `-md0     9:0    0   1.8T  0 raid1 /home
sdb         8:16   1   1.8T  0 disk
`-sdb1      8:17   1   1.8T  0 part
  `-md0     9:0    0   1.8T  0 raid1 /home
sdc         8:32   0 149.1G  0 disk
`-sdc1      8:33   0   149G  0 part  /
mtdblock0  31:0    0   1.5M  0 disk
mtdblock1  31:1    0   128K  1 disk
mtdblock2  31:2    0     6M  0 disk
mtdblock3  31:3    0     4M  0 disk
mtdblock4  31:4    0   116M  0 disk
$

The issue repeated on a reboot. I assume we should try to sort out the reported BUG before experimenting with a new DTB?

Regards,
Trond Melen

Attachments:
open | download - dmesg.lst (39.5KB)

bodhi


May 27, 2024 04:13PM
Admin
Registered: 13 years ago
Posts: 18,712

Hi Trond,

Not sure about this kernel bug (I'm not using RAID on Mirabox, obviously)!

There is an error in your bootargs:

Quote

[ 0.000000] Kernel command line: console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 mtdparts=armada-nand:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) earlyprintk=serial mtdparts=armada-nand:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) reason=normal bdtype=rn102
[ 0.000000] Unknown kernel command line parameters "reason=normal bdtype=rn102", will be passed to user space.

Notice that the mtdparts parameter is repeated. Sometimes badly formatted bootargs can affect rootfs mounting. However, unknown parameters such as "reason=normal bdtype=rn102" are OK; the kernel does not act on them and just passes them to user space.

Perhaps you should boot with serial console and fix the bootargs first, in case it is causing a side effect.
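
A quick way to spot a repeated parameter without reading the whole line is to list the keys that occur more than once. This is a minimal sketch; the function name dup_params is an assumption for illustration, and it only looks at space-separated key=value words.

```shell
#!/bin/sh
# dup_params CMDLINE
# Prints each parameter name that appears more than once in CMDLINE.
# Words without '=' are ignored.
dup_params() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/=.*//p' | sort | uniq -d
}

# Typical use on a running system:
#   dup_params "$(cat /proc/cmdline)"
```

On the kernel command line quoted above, this would report mtdparts.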

-bodhi

tme


May 28, 2024 02:03AM
Registered: 7 years ago
Posts: 165

Hi bodhi,

Where do the extra environment variables come from? As I understand, these are the only environment variables that are actually used by 'U-Boot' when booting:

$ sudo fw_printenv | egrep -i 'bootcmd=|sata_set_bootargs=|sata_boot=|mtdparts=|load_uimage=|load_uinitrd=' | sort
bootcmd=ide reset; run sata_bootcmd; reset
load_uimage=ext2load ide 0:1 0x2000000 /boot/uImage
load_uinitrd=ext2load ide 0:1 0x3000000 /boot/uInitrd
mtdparts=mtdparts=armada-nand:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs)
sata_boot=run load_uimage; if run load_uinitrd; then bootm 0x2000000 0x3000000; else bootm 0x2000000; fi
sata_bootcmd=run sata_set_bootargs; run sata_boot
sata_set_bootargs=setenv bootargs console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 $mtdparts earlyprintk=serial
$

And what about those passed to the user environment?

$ sudo fw_printenv | egrep -i 'reason|normal|bdtype|rn102'
SKUNum=RN102
Startup=Normal

Are they hard coded into the stock 'U-Boot'? The full output of 'fw_printenv' is attached.

Regards,
Trond Melen

Attachments:
open | download - fw_printenv.lst (2.4KB)

bodhi


May 28, 2024 02:26PM
Admin
Registered: 13 years ago
Posts: 18,712

Hi Trond,

> Where do the extra environment variables come
> from? As I understand, these are the only
> environment variables that are actually used by
> 'U-Boot' when booting:

sata_bootcmd=run sata_set_bootargs; run sata_boot
sata_set_bootargs=setenv bootargs console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 $mtdparts earlyprintk=serial

The above looks OK.

> And what about those passed to the user
> environment?
>
> $ sudo fw_printenv | egrep -i 'reason|normal|bdtype|rn102'
> SKUNum=RN102
> Startup=Normal
>
> Are they hard coded into the stock 'U-Boot'?
> Attached the full output of 'fw_printenv'.

I think this must be a case where stock u-boot is buggy and has a problem parsing mtdparts, so it appends these extra bootargs. The result is that a wrong mtdparts is passed to the kernel. So the current MTD partitions in your system come from the DTB definition, not from the bootargs.

[ 3.184866] 5 fixed-partitions partitions found on MTD device pxa3xx_nand-0
[ 3.191969] Creating 5 MTD partitions on "pxa3xx_nand-0":
[ 3.197597] 0x000000000000-0x000000180000 : "u-boot"
[ 3.206874] 0x000000180000-0x0000001a0000 : "u-boot-env"
[ 3.215802] 0x000000200000-0x000000800000 : "uImage"
[ 3.225242] 0x000000800000-0x000000c00000 : "minirootfs"
[ 3.234744] 0x000000c00000-0x000008000000 : "ubifs"

The mtdparts should be corrected as:

mtdparts=mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs)

But currently you cannot set the envs in Debian using fw_setenv, because they are read-only in the DTB definition.

So it must be set in the serial console as follows:

setenv mtdparts 'mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs)'

======

Later, I'll update the DTS to remove the read-only property for this RN102 and RN104. If we cannot set them then they are not much use in Debian.

-bodhi

tme


May 28, 2024 04:57PM
Registered: 7 years ago
Posts: 165

Hi bodhi,

Thanks for your advice! The BUG message in the 'dmesg' output is now gone. Attached output on the serial console.

There are still some warnings printed in bold on the serial console:

[ 1.990632] gpio gpiochip0: Static allocation of GPIO base is deprecated, use dynamic allocation.
[ 2.000817] debugfs: Directory 'd0018100.gpio' with parent 'regmap' already present!
[ 2.010570] gpio gpiochip1: Static allocation of GPIO base is deprecated, use dynamic allocation.
[ 2.020570] debugfs: Directory 'd0018140.gpio' with parent 'regmap' already present!
[ 2.029418] gpio gpiochip2: Static allocation of GPIO base is deprecated, use dynamic allocation.
[ 6.393683] mtdblock: MTD device 'u-boot' is NAND, please consider using UBI block devices instead.
[ 6.406327] mtdblock: MTD device 'uImage' is NAND, please consider using UBI block devices instead.
[ 6.420679] mtdblock: MTD device 'u-boot-env' is NAND, please consider using UBI block devices instead.
[ 6.462804] mtdblock: MTD device 'minirootfs' is NAND, please consider using UBI block devices instead.
[ 6.491866] mtdblock: MTD device 'ubifs' is NAND, please consider using UBI block devices instead.
[ 23.088713] mtdblock: MTD device 'u-boot' is NAND, please consider using UBI block devices instead.
[ 23.317898] mtdblock: MTD device 'u-boot-env' is NAND, please consider using UBI block devices instead.
[ 23.404877] mtdblock: MTD device 'uImage' is NAND, please consider using UBI block devices instead.
[ 23.484129] mtdblock: MTD device 'minirootfs' is NAND, please consider using UBI block devices instead.
[ 23.590980] mtdblock: MTD device 'ubifs' is NAND, please consider using UBI block devices instead.

The setup variable 'mtdparts' is still reported differently by 'printenv' on the console

Marvell>> printenv mtdparts
mtdparts=mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs)

and 'dmesg' by 'ssh':

[ 0.000000] Kernel command line: console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 mtdparts=armada-nand:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) earlyprintk=serial mtdparts=armada-nand:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) reason=normal bdtype=rn102

The setup variables passed to the user environment persist.

Regards,
Trond Melen

Attachments:
open | download - serial.lst (38.9KB)
open | download - dmesg.lst (29.2KB)

bodhi


May 28, 2024 06:37PM
Admin
Registered: 13 years ago
Posts: 18,712

Hi Trond,

> Thanks for your advice! The BUG message in the
> 'dmesg' output is now gone. Attached output on the
> serial console.

Cool! It must have been some side effects that we could not see.

>
> There are still some warnings printed in bold on
> the serial console:

> [ 1.990632] gpio gpiochip0: Static allocation
> of GPIO base is deprecated, use dynamic
> allocation.
> [ 2.000817] debugfs: Directory 'd0018100.gpio'
> with parent 'regmap' already present!
> [ 2.010570] gpio gpiochip1: Static allocation
> of GPIO base is deprecated, use dynamic
> allocation.
> [ 2.020570] debugfs: Directory 'd0018140.gpio'
> with parent 'regmap' already present!
> [ 2.029418] gpio gpiochip2: Static allocation
> of GPIO base is deprecated, use dynamic
> allocation.

The above warnings are OK. The current DTS (from mainline) still uses the old static GPIO base allocation, and the messages are just a reminder to update. They are warnings only, so it's OK for now. I'll look at this when building the next MVEBU kernel version.

> [ 6.393683] mtdblock: MTD device 'u-boot' is
> NAND, please consider using UBI block devices
> instead.
> [ 6.406327] mtdblock: MTD device 'uImage' is
> NAND, please consider using UBI block devices
> instead.
> [ 6.420679] mtdblock: MTD device 'u-boot-env'
> is NAND, please consider using UBI block devices
> instead.
> [ 6.462804] mtdblock: MTD device 'minirootfs'
> is NAND, please consider using UBI block devices
> instead.
> [ 6.491866] mtdblock: MTD device 'ubifs' is
> NAND, please consider using UBI block devices
> instead.
> [ 23.088713] mtdblock: MTD device 'u-boot' is
> NAND, please consider using UBI block devices
> instead.
> [ 23.317898] mtdblock: MTD device 'u-boot-env'
> is NAND, please consider using UBI block devices
> instead.
> [ 23.404877] mtdblock: MTD device 'uImage' is
> NAND, please consider using UBI block devices
> instead.
> [ 23.484129] mtdblock: MTD device 'minirootfs'
> is NAND, please consider using UBI block devices
> instead.
> [ 23.590980] mtdblock: MTD device 'ubifs' is
> NAND, please consider using UBI block devices
> instead.

These are the same kind of warnings. The stock MTD layout uses raw NAND (i.e. to store the firmware). When the mtd block device is used, the warnings are printed to tell people to use UBI instead (as they should). A typical kernel will not print this because the mtd block device module is not loaded right away when the kernel boots. But our kernel has it built in.

I was going to move it out to initramfs to make the kernel log less noisy and shrink uImage but forgot about it!

> The setup variable 'mtdparts' is still reported
> differently by 'printenv' on the console

> and 'dmesg' by 'ssh':
>
> [ 0.000000] Kernel command line:
> console=ttyS0,115200 root=LABEL=rootfs
> rootdelay=10
> mtdparts=armada-nand:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs)
> earlyprintk=serial
> mtdparts=armada-nand:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs)
> reason=normal bdtype=rn102
>
> The setup variables passed to the user environment
> persist.

This might be related to the syntax we use to set up the envs. Stock u-boot could be buggy in dereferencing the $mtdparts inside another setenv.

Try using single quotes and curly braces ({ }), as below:

setenv mtdparts 'mtdparts=armada-nand:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs)'
setenv sata_set_bootargs 'setenv bootargs console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 ${mtdparts} earlyprintk=serial'

-bodhi

Edited 1 time(s). Last edit at 05/28/2024 06:38PM by bodhi.

tme


May 29, 2024 01:45AM
Registered: 7 years ago
Posts: 165

Hi bodhi,

The full serial console is attached. This is the U-Boot dialog:

Marvell>> printenv mtdparts sata_set_bootargs
mtdparts=mtdparts=armada-nand:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs)
sata_set_bootargs=setenv bootargs console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 $mtdparts earlyprintk=serial
Marvell>> setenv mtdparts 'mtdparts=armada-nand:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs)'
Marvell>> setenv sata_set_bootargs 'setenv bootargs console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 ${mtdparts} earlyprintk=serial'
Marvell>> printenv mtdparts sata_set_bootargs
mtdparts=mtdparts=armada-nand:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs)
sata_set_bootargs=setenv bootargs console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 ${mtdparts} earlyprintk=serial

The BUG message is back:

[ 29.948156][ T110] BUG: scheduling while atomic: kworker/0:2/110/0x00000002

Should the 'mtdparts' setup parameter include "mtdparts=pxa3xx_nand-0" or "mtdparts=armada-nand", or maybe it doesn't matter?

Regards,
Trond Melen

Attachments:
open | download - serial.lst (52.5KB)

bodhi


May 30, 2024 02:17PM
Admin
Registered: 13 years ago
Posts: 18,712

Hi Trond,

Quote

Marvell>> printenv mtdparts sata_set_bootargs
mtdparts=mtdparts=armada-nand:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs)

> The BUG message is back:
>
> [ 29.948156][ T110] BUG: scheduling while
> atomic: kworker/0:2/110/0x00000002

The mtdparts env was not set correctly, so you're actually back to the previous boot setup.

So set it in the serial console:

setenv mtdparts 'mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs)'
setenv sata_set_bootargs 'setenv bootargs console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 ${mtdparts} earlyprintk=serial'

and then save it (saving is not necessary for this boot, but it lets the next boot proceed without interruption).

saveenv
boot

> Should the 'mtdparts' setup parameter include
> "mtdparts=pxa3xx_nand-0" or
> "mtdparts=armada-nand", or maybe it doesn't matter?

Yes, it must be pxa3xx_nand-0 for the NAND driver to be found and run in the kernel.

If the kernel BUG does not show up after setting the correct NAND driver pxa3xx_nand, then I think armada_nand, as a non-existent driver, might have something to do with it.

-bodhi

tme


June 06, 2024 02:02PM
Registered: 7 years ago
Posts: 165

Hi bodhi,

I'm sorry for this slow response! And sorry for having forgotten that modifications to the U-Boot environment are volatile until saved.

I've experimented a little. The BUG line comes and goes. Sometimes there is no such line, sometimes it refers to "mdadm" and sometimes it refers to "kworker". These lines are from 10 reboots with no modifications to the U-Boot environment in between:

[ 29.788241] BUG: scheduling while atomic: kworker/0:2/110/0x00000002
na.
[ 30.028215] BUG: scheduling while atomic: kworker/0:2/110/0x00000002
[ 27.944406] BUG: scheduling while atomic: mdadm/1710/0x00000002
[ 27.628423][ T110] BUG: scheduling while atomic: kworker/0:2/110/0x00000002 (Got stuck here! Power cycle required to reset.)
na.
[ 29.948205][ T8] BUG: scheduling while atomic: kworker/0:0/8/0x00000002
[ 29.868228][ T110] BUG: scheduling while atomic: kworker/0:2/110/0x00000002
[ 29.948315][ T110] BUG: scheduling while atomic: kworker/0:2/110/0x00000002 (Got stuck here! Power cycle required to reset.)
na.

To me this looks like a race condition where the end result depends on small timing differences. It may (or may not) be related to the flaw in the system clock. I did not check if the real time clock will reset the box when it's stuck, but it would come as a surprise.

After boot, 'fw_setenv' and 'fw_printenv' seem to work fine, but root privileges are required. '${mtdparts}' in the U-Boot environment does not seem to work. I found no way of avoiding the "$mtdparts" being reported twice by the kernel right after a reboot, but I don't think it does any harm.

Then I went on to test the same kernel with the new DTB. It failed with

[ 3.310045][ T1] mdio_bus d0072004.mdio-mii: MDIO device at address 1 is missing.

and powered off the box! The serial console output is attached.

BTW, there seems to be a 32 character limit on image names. "initramfs-6.8.7-mvebu-370xp-tld-1" has 33 characters and is truncated by 'mkimage' on the box.
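
A name can be checked against that limit before running 'mkimage'. This is just a sketch: the 32-character limit is taken from the observation above, and the names IMAGE_NAME_MAX and name_overflow are made up for illustration.

```shell
#!/bin/sh
# Maximum image name length, taken from the observation above (assumption).
IMAGE_NAME_MAX=32

# name_overflow NAME
# Prints by how many characters NAME exceeds the limit (0 if it fits).
name_overflow() {
    name=$1
    over=$(( ${#name} - $IMAGE_NAME_MAX ))
    if [ "$over" -lt 0 ]; then
        over=0
    fi
    echo "$over"
}

# Example:
#   name_overflow "initramfs-6.8.7-mvebu-370xp-tld-1"
```

A result greater than 0 means the name would be cut short, so a shorter name is needed.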

Regards,
Trond Melen

Attachments:
open | download - serial.lst (21.8KB)

bodhi


June 06, 2024 02:47PM
Admin
Registered: 13 years ago
Posts: 18,712

Hi Trond,

> I've experimented a little. The BUG line comes and
> goes. Sometimes there is no such line, sometimes it
> refers to "mdadm" and sometimes it refers to
> "kworker".

> To me this looks like a race condition where the
> end result depends on small timing differences. It
> may (or may not) be related to the flaw in the
> system clock. I did not check if the real time
> clock will reset the box when it's stuck, but it
> would come as a surprise.

I think you might be right about the timing difference. I ran the new kernel on the Mirabox for a day, and saw no BUG.

>
> After boot, 'fw_setenv' and 'fw_printenv' seem to
> work fine, but root privileges are required.
> '${mtdparts}' in the U-Boot environment does not
> seem to work. I found no way of avoiding the
> "$mtdparts" being reported twice by the kernel
> right after a reboot, but I don't think it does
> any harm.

So to confirm, if the bootargs is sane (no repeated mtdparts and other bad garbage), you don't see the BUG?

You could also remove '${mtdparts}' from the bootargs. It is only for accessing the u-boot envs in Debian. Just to get a clean bootarg would be good, to eliminate any unknown side effect.

> Then I went on to test the same kernel with the
> new DTB. It failed with
>
> [ 3.310045][ T1] mdio_bus d0072004.mdio-mii:
> MDIO device at address 1 is missing.

Bummer, please go back to the current DTB in the kernel linux-6.8.7-mvebu-370xp-tld-1.

> BTW, there seems to be a 32 character limit on
> image names. "initramfs-6.8.7-mvebu-370xp-tld-1"
> has 33 characters and is truncated by 'mkimage' on
> the box.

That's right! I've noticed that too. Perhaps I should use a shorter naming convention :)

-bodhi

tme


June 07, 2024 01:54PM
Registered: 7 years ago
Posts: 165

Hi bodhi,

Quote

I think you might be right about the timing difference. I ran the new kernel on the Mirabox for a day, and saw no BUG.

When the line "BUG: scheduling while atomic" appears on the serial console (and later in the 'dmesg' output), it is always between 27 s and 30 s into the kernel boot. It is always followed by two debug stack dumps as in the attachment to this post as well as to my first post on this kernel version.

Quote

So to confirm, if the bootargs is sane (no repeated mtdparts and other bad garbage), you don't see the BUG?

The U-Boot environment looks clean. Still, 'dmesg' reports the value of $mtdparts twice:

root@rn102:~# dmesg | grep mtdparts | fold -sw 80
[ 0.000000] Kernel command line: console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) earlyprintk=serial mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) reason=normal bdtype=rn102

I assume that $mtdparts is also reported twice on the serial console during boot, even though I don't see it. I assume 'minicom' truncates long lines.

The first time I followed the boot of Linux kernel version 6.8.7 on the console, warnings were in bold. I liked it, but now it's gone. Is there a way to restore that behavior?

I just discovered some error messages from the kernel on the serial console at login. The same messages are repeated at each subsequent login:

stop_machine_cpuslocked+0x108/0x13c from patch_text+0x24/0x4c
stop_machine_cpuslocked+0x108/0x13c from patch_text+0x24/0x4c

These messages are not present in the 'dmesg' output.

To conclude, I cannot currently recommend Linux kernel version 6.8.7 for Netgear RN102. The major problem is that the boot may freeze after ~30 s, requiring the box to be unplugged from power to recover. This happens in about one out of five reboots.

The output on the serial console after restoring the DTS is attached. (The U-Boot environment was printed before boot.) What do you think the debug stack traces and the kernel messages at login tell us?

Regards,
Trond Melen

Attachments:
open | download - serial.lst (52.7KB)

bodhi


June 07, 2024 03:43PM
Admin
Registered: 13 years ago
Posts: 18,712

Hi Trond,

> The U-Boot environment looks clean. Still, 'dmesg'
> report the value of $mtdparts twice:

> I assume that $mtdparts is also reported twice on
> the serial console during boot, even though I
> don't see it. I assume 'minicom' truncates long
> lines.

>
> The first time I followed the boot of Linux kernel
> version 6.8.7 on the console, warnings were in
> bold. I liked it, but now it's gone. Is there a
> way to restore that behavior?

Not sure how.

> The output on the serial console after restoring
> the DTS is attached. (The U-Boot environment was
> printed before boot.) What do you think the debug
> stack traces and the kernel messages at login tell
> us?

I think it is a real BUG in the kernel. Unfortunately it shows up in our custom kernel and only in this RN102 box so it won't be well accepted if I send the bug report.

=======

I think I will reconfigure it back to non SMP in an interim kernel, e.g. linux-6.8.7-mvebu-370xp-tld-2. And then see if it will trigger the same bug (I don't think so, but we will see).

In the meantime, could you boot this 6.8.7 kernel with the following change in the bootargs? Power up, interrupt u-boot at the serial console, and run:

setenv sata_set_bootargs 'setenv bootargs console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 earlyprintk=serial'
boot

(I just don't like to see the kernel bootargs being messed up. Notice that "reason=normal bdtype=rn102" looks like garbage that comes after the mtdparts env. This u-boot could have a bug where, when it resolves $mtdparts, it reads a longer string from memory).

Quote

mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@
0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) earlyprintk=serial
mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@
0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) reason=normal
bdtype=rn102

-bodhi

tme


June 07, 2024 08:52PM
Registered: 7 years ago
Posts: 165

Hi bodhi,

Quote

I just don't like to see the kernel bootargs being messed up.

I halted U-Boot, modified "sata_set_bootargs" and booted five times:

root@rn102:~# dmesg | grep mtdparts | fold -sw 80
[ 0.000000] Kernel command line: console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 earlyprintk=serial mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) reason=normal bdtype=rn102
root@rn102:~# dmesg | grep "scheduling while atomic"
[ 29.788416] BUG: scheduling while atomic: kworker/0:2/110/0x00000002
[ 36.188439][ T110] BUG: scheduling while atomic: kworker/0:2/110/0x00000002
root@rn102:~# dmesg | grep mtdparts | fold -sw 80
[ 0.000000] Kernel command line: console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 earlyprintk=serial mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) reason=normal bdtype=rn102
root@rn102:~# dmesg | grep "scheduling while atomic"
[ 30.348387] BUG: scheduling while atomic: kworker/0:2/110/0x00000002
root@rn102:~# dmesg | grep mtdparts | fold -sw 80
[ 0.000000] Kernel command line: console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 earlyprintk=serial mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) reason=normal bdtype=rn102
root@rn102:~# dmesg | grep "scheduling while atomic"
[ 30.348387] BUG: scheduling while atomic: kworker/0:2/110/0x00000002
root@rn102:~# dmesg | grep mtdparts | fold -sw 80
[ 0.000000] Kernel command line: console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 earlyprintk=serial mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) reason=normal bdtype=rn102
root@rn102:~# dmesg | grep "scheduling while atomic"
root@rn102:~#

Setting the U-Boot setup variable sata_set_bootargs to its final value made mtdparts be reported only once, but not much else changed. The "reason=normal bdtype=rn102" part persists.

The "BUG: scheduling while atomic" appeared in all but the 5th boot. The 2nd boot brought the box to a complete halt, requiring disconnecting it from power to get going.

Quote
tme
When the line "BUG: scheduling while atomic" appears on the serial console (and later in the 'dmesg' output), it is always between 27 s and 30 s into the kernel boot.

Well, the interval must be extended to 27 s to 37 s to include the 2nd boot.
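
The window can also be read off mechanically. A minimal sketch, assuming a POSIX shell with sed and awk and the usual `[ seconds.micros]` timestamp prefix in the kernel log; the function names `bug_times` and `bug_window` are made up for illustration:

```shell
# Print the timestamp (seconds since boot) of every
# "BUG: scheduling while atomic" line read from stdin.
bug_times() {
    sed -n 's/^\[ *\([0-9][0-9]*\.[0-9][0-9]*\)].*BUG: scheduling while atomic.*/\1/p'
}

# Print the earliest and latest of those timestamps.
bug_window() {
    bug_times | awk '
        NR == 1 || $1 < min { min = $1 }
        NR == 1 || $1 > max { max = $1 }
        END { if (NR) printf "%s s to %s s\n", min, max }'
}
```

On a running box this would be `dmesg | bug_window`. For a boot that hangs, a saved serial console log can be fed in instead, since the pattern also accepts lines that carry an extra `[ T110]` field after the timestamp.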

Regards,
Trond Melen


bodhi


June 07, 2024 10:11PM
Admin
Registered: 13 years ago
Posts: 18,712

Hi Trond,

> I halted U-Boot, modified "sata_set_bootargs" and
> booted five times:

> root@rn102:~# dmesg | grep mtdparts | fold -sw 80

> [ 0.000000] Kernel command line:
> console=ttyS0,115200 root=LABEL=rootfs
> rootdelay=10 earlyprintk=serial
> mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@
> 0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs)
> reason=normal
> bdtype=rn102

That's really strange. I need to look at the envs more closely.

-bodhi

bodhi


June 12, 2024 03:18PM
Admin
Registered: 13 years ago
Posts: 18,712

Hi Trond,

Have you seen this problem that Jungle is seeing with network receive buffer overrun?

https://forum.doozan.com/read.php?2,137625,137696#msg-137696

-bodhi

tme


June 13, 2024 02:18PM
Registered: 7 years ago
Posts: 165

Hi bodhi,

Quote

Have you seen this problem that Jungle is seeing with network receive buffer overrun?

No, I have not:

tme@rn102:~$ uptime
 19:45:39 up 5 days, 5:10, 2 users, load average: 0.00, 0.00, 0.00
tme@rn102:~$ dmesg | grep overrun
tme@rn102:~$

IIRC, RN104 has two Ethernet ports, while RN102 has one. The SoC has two interfaces on both. Could it be that the overrun occurs on the one that is unused on RN102?

I've noticed that @jungle_roger got the "BUG: scheduling while atomic" message already after 10 seconds:

[ 10.428258] BUG: scheduling while atomic: kworker/0:0/8/0x00000002

Could it be that "/bin/systemd" exercises the kernel's scheduler more intensively than "/sbin/init"?

Regards,
Trond Melen


bodhi


June 13, 2024 02:47PM
Admin
Registered: 13 years ago
Posts: 18,712

Hi Trond,

> Quote
> Have you seen this problem that Jungle is
> seeing with network receive buffer
> overrun?
>
> No, I have not:

Good to hear that!

> tme@rn102:~$ uptime
> 19:45:39 up 5 days, 5:10, 2 users, load
> average: 0.00, 0.00, 0.00
> tme@rn102:~$ dmesg | grep overrun
> tme@rn102:~$
>
> IIRC, RN104 has two Ethernet ports, while RN102 has
> one. The SoC has two interfaces on both. Could it
> be that the overrun occurs on the one that is
> unused on RN102?

No, the overruns are on eth0:

Quote

[ 692.478184] mvneta d0070000.ethernet eth0: bad rx status 0f830000 (overrun error), size=448
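
To see which interface the overruns land on without scanning the log by hand, the messages can be tallied per interface. A minimal sketch, assuming a POSIX shell with awk and the mvneta message layout shown in the line above; `overruns_by_iface` is a hypothetical helper name:

```shell
# Count "(overrun error)" lines per network interface in kernel log
# text read from stdin. Assumes the message layout:
#   ... mvneta <device> <iface>: bad rx status <hex> (overrun error), size=<n>
overruns_by_iface() {
    awk '/\(overrun error\)/ {
        for (i = 1; i + 2 <= NF; i++)
            if ($i == "mvneta") {
                iface = $(i + 2)        # e.g. "eth0:"
                sub(/:$/, "", iface)    # drop the trailing colon
                count[iface]++
                break
            }
    }
    END { for (name in count) print name, count[name] }' | sort
}
```

Run as `dmesg | overruns_by_iface`; it prints one line per interface with the number of overrun messages, and nothing at all when there are none.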

> I've noticed that @jungle_roger got the "BUG:
> scheduling while atomic" message already after 10
> seconds:
>
> [ 10.428258] BUG: scheduling while atomic:
> kworker/0:0/8/0x00000002
>
> Could it be that "/bin/systemd" exercises the
> kernel's scheduler more intensively than
> "/sbin/init"?

At that point in the boot, it looks like systemd is not running yet.
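
One way to check this on a given boot is to compare the timestamp of the first BUG line with the timestamp of the first message logged by PID 1. A minimal sketch, assuming a POSIX shell and assuming that systemd writes its early messages to the kernel log with a `systemd[1]:` prefix (typical on Debian, but not verified on this box); `first_time` is a hypothetical helper name:

```shell
# Print the timestamp of the first kernel-log line on stdin that
# matches the extended regular expression given as $1.
first_time() {
    grep -E -- "$1" | head -n 1 |
        sed -n 's/^\[ *\([0-9][0-9]*\.[0-9][0-9]*\)].*/\1/p'
}
```

For example, after `dmesg > /tmp/boot.log`, compare `first_time 'BUG: scheduling while atomic' < /tmp/boot.log` with `first_time 'systemd\[1\]:' < /tmp/boot.log`. If the BUG timestamp is the smaller one, the message was logged before systemd produced any output.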

-bodhi
