Posted by bodhi
bodhi March 02, 2024 03:02PM | Admin Registered: 13 years ago Posts: 18,712 |
Hi Trond,
> most likely because 'mkimage' on the 2nd box do
> not support 'MODULES=dep':
> So it seems 'mkimage' got a face lift in Debian
> Bookworm?
That's probably not the reason, because 'MODULES=dep' was already available in Debian 11.x.
This is one of my home media NAS boxes, still running Debian 11.
root@HomeMedia1:~# cat /etc/debian_version
11.1
root@HomeMedia1:~# grep -i modules /etc/initramfs-tools/initramfs.conf
# MODULES: [ most | netboot | dep | list ]
# dep - Try and guess which modules to load.
# netboot - Add the base modules, network modules, but skip block devices.
# list - Only include modules from the 'additional modules' list
MODULES=most
Perhaps "dep" did not work before. I'll run a test to see if I can recreate your observation.
-bodhi
===========================
Forum Wiki
bodhi's corner (buy bodhi a beer)
bodhi March 02, 2024 03:10PM | Admin Registered: 13 years ago Posts: 18,712 |
Hi Trond,
> Perhaps "dep" did not work before. I'll run a test
> to see if I can recreate your observation.
I've verified that "MODULES=dep" works in Debian 11.
-bodhi
bodhi March 03, 2024 02:57PM | Admin Registered: 13 years ago Posts: 18,712 |
Here is kernel linux-6.7.5-mvebu-370xp-tld-3. The changes in this build:
- Configured as an SMP kernel, to add support for the Netgear RN2120 (Armada XP, 2 cores).
- I2S and SPDIF are still configured in the kernel (these were removed in linux-6.7.5-mvebu-370xp-tld-2).
- The DTB for the Netgear RN102 has the Audio Controller nodes removed. This is for testing the time lag issue on the RN102 (to see whether it helps or hurts).
Download at Dropbox
linux-6.7.5-mvebu-370xp-tld-3-bodhi.tar.bz2
md5:a44771dd7d586d2171d9c68aeb19e96e
sha256:d9e82ba31cf2ddc123aa882bee0dd4c4c05ab5095eccc7546c6c8d33fb9f100c
This tarball contains 5 files:
linux-image-6.7.5-mvebu-370xp-tld-3_3_armhf.deb
linux-headers-6.7.5-mvebu-370xp-tld-3_3_armhf.deb
zImage-6.7.5-mvebu-370xp-tld-3
config-6.7.5-mvebu-370xp-tld-3
linux-dtb-6.7.5-mvebu-370xp-tld-3.tar
Please do a full kernel installation (see the instructions for kernel linux-6.5.7-mvebu-370xp-tld-1 up top).
-bodhi
tme March 03, 2024 03:55PM | Registered: 7 years ago Posts: 165 |
Hi bodhi,
Thank you very much for your efforts! I have successfully installed Linux kernel version 'linux-6.7.5-mvebu-370xp-tld-3' on a ReadyNAS RN102. It works as expected.
To check the system clock, I disabled the 'crontab' job that updates the system clock from the hardware clock once a minute. I also made the RN102 trust my laptop so that 'ssh' works without a password. On the laptop I queried the system clock twice, with a 5-minute delay in between:
$ uname -a
Linux rn102 6.7.5-mvebu-370xp-tld-3 #1 SMP PREEMPT Thu Feb 29 19:03:59 PST 2024 armv7l GNU/Linux
$ cat /etc/debian_version
12.4
$ ssh rn102 date -R && sleep 300 && ssh rn102 date -R
Sun, 03 Mar 2024 22:41:39 +0100
Sun, 03 Mar 2024 22:46:14 +0100
So the rn102's system clock lagged 25 seconds in 5 minutes. That's 5 seconds per minute.
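The arithmetic can be reproduced with a few lines of shell. This is only a sketch: hms_to_secs and clock_lag are made-up helper names, it assumes both timestamps fall on the same day, and it ignores the ssh round-trip time.

```shell
# hms_to_secs and clock_lag are hypothetical helpers for illustration.
# Convert HH:MM:SS to seconds since midnight.
hms_to_secs() {
    h=${1%%:*}; rest=${1#*:}; m=${rest%%:*}; s=${rest#*:}
    # Strip one leading zero so "08" or "09" is not parsed as octal.
    h=${h#0}; m=${m#0}; s=${s#0}
    h=${h:-0}; m=${m:-0}; s=${s:-0}
    echo $(( $h * 3600 + $m * 60 + $s ))
}

# clock_lag FIRST SECOND WALL_SECONDS
# Lag = real elapsed time minus the time the box's clock says has passed.
clock_lag() {
    start=$(hms_to_secs "$1")
    end=$(hms_to_secs "$2")
    echo $(( $3 - (end - start) ))
}

# Timestamps from the two 'date -R' calls, 300 s apart on the laptop.
lag=$(clock_lag 22:41:39 22:46:14 300)
echo "lag: ${lag} s in 300 s, $(( lag * 60 / 300 )) s per minute"
```

The box's clock advanced only 275 s during the 300 s sleep, which is where the 25 s figure comes from.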
Regards,
Trond Melen
bodhi March 03, 2024 04:18PM | Admin Registered: 13 years ago Posts: 18,712 |
Hi Trond,
6.7.5-mvebu-370xp-tld-3
> So the rn102's system clock lagged 25 seconds in 5
> minutes. That's 5 seconds per minute.
6.7.5-mvebu-370xp-tld-2
> It used to be lagging 6 to 7 seconds
> per minute. Now it's lagging just above 5 seconds
> per minute.
So this shows that the kernel configs for I2S and SPDIF can remain; we just need to remove the audio controller nodes in the DTS to see the difference.
Perhaps later you can also tell whether SMP kernel performance is about the same as non-SMP in some tasks, or whether the difference is insignificant. The reason I used non-SMP was danitool's testing on OpenWrt, as credited in the 1st post (there are some other tweaks for the Armada 370 SoC by danitool that I have not looked into).
But SMP on a single CPU should only incur a very small overhead, IMO.
-bodhi
bodhi March 13, 2024 06:23PM | Admin Registered: 13 years ago Posts: 18,712 |
Kernel linux-6.7.5-mvebu-370xp-tld-3 package has been uploaded. See 1st post for download link.
Note that this is the same as the working kernel I've uploaded before in this thread.
-bodhi
Edited 1 time(s). Last edit at 03/13/2024 06:24PM by bodhi.
bodhi March 19, 2024 05:30PM | Admin Registered: 13 years ago Posts: 18,712 |
pczerepaniak & Trond,
I've tried WOL on my Mirabox and it did not work either.
Note that in /etc/init.d/halt, NETDOWN is not cleared, so the halt action would shut down the network:
halt -d -f $netdown $poweroff $hddown
So I explicitly modified this initscript to do
halt -d -f $poweroff $hddown
Even with the above change, WOL does nothing.
If you are running systemd, translate all of the above into the equivalent systemd scripts.
-bodhi
bodhi May 21, 2024 02:47PM | Admin Registered: 13 years ago Posts: 18,712 |
Kernel linux-6.8.7-mvebu-370xp-tld-1-bodhi.tar.bz2 package has been uploaded. Please see 1st post for download link.
Hi Trond,
I did not implement any patch to improve the timing problem on the RN102/104 in this kernel. I'm thinking we should try to modify the DTS first, so I will post the working DTB here later.
-bodhi
bodhi May 22, 2024 01:54PM | Admin Registered: 13 years ago Posts: 18,712 |
Hi Trond,
The attached DTBs have the watchdog timer removed. The RN102 also has audio controller and SPDIF removed. Please try the timing test to see if there is any difference.
-bodhi
Attachments:
open | download - armada-370-netgear-rn102.dtb (14.8KB)
open | download - armada-370-netgear-rn104.dtb (15.2KB)
tme May 27, 2024 11:59AM | Registered: 7 years ago Posts: 165 |
Thanks bodhi,
for continuing to support the Netgear ReadyNAS! Following the procedure in the 1st post, I upgraded one of my RN102 boxes to Linux kernel version 6.8.7. The box booted fine and I could log in using 'ssh' and copy files using 'scp', so the new kernel basically works.
Examining the 'dmesg' output (attached), I discovered two debug traces, though, after this line:
[ 28.050207] BUG: scheduling while atomic: mdadm/1710/0x00000002
'mdadm' indicates an issue with 'raid', but the '/home' directory mounted OK:
$ lsblk
NAME      MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
sda         8:0    1   1.8T  0 disk
`-sda1      8:1    1   1.8T  0 part
  `-md0     9:0    0   1.8T  0 raid1 /home
sdb         8:16   1   1.8T  0 disk
`-sdb1      8:17   1   1.8T  0 part
  `-md0     9:0    0   1.8T  0 raid1 /home
sdc         8:32   0 149.1G  0 disk
`-sdc1      8:33   0   149G  0 part  /
mtdblock0  31:0    0   1.5M  0 disk
mtdblock1  31:1    0   128K  1 disk
mtdblock2  31:2    0     6M  0 disk
mtdblock3  31:3    0     4M  0 disk
mtdblock4  31:4    0   116M  0 disk
$
The issue repeated on a reboot. I assume we should try to sort out the reported BUG before experimenting with a new DTB?
Regards,
Trond Melen
Attachments:
open | download - dmesg.lst (39.5KB)
bodhi May 27, 2024 04:13PM | Admin Registered: 13 years ago Posts: 18,712 |
Hi Trond,
Not sure about this kernel bug (I'm not using RAID on Mirabox, obviously)!
There is an error in your bootargs:
Quote
[ 0.000000] Kernel command line: console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 mtdparts=armada-nand:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) earlyprintk=serial mtdparts=armada-nand:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) reason=normal bdtype=rn102
[ 0.000000] Unknown kernel command line parameters "reason=normal bdtype=rn102", will be passed to user space.
Notice that the mtdparts parameter is repeated. Sometimes badly formatted bootargs can affect rootfs mounting. However, unknown parameters like "reason=normal bdtype=rn102" are OK, i.e. they are ignored by the kernel.
Perhaps you should boot with serial console and fix the bootargs first, in case it is causing a side effect.
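A quick way to spot repeated parameters like the doubled mtdparts is to list the parameter names that occur more than once on the command line. This is only a sketch: dup_params is a made-up helper name, and the sample string is a shortened version of the command line quoted above.

```shell
# dup_params is a hypothetical helper, not an existing tool.
# Print every parameter name that occurs more than once in a command line.
dup_params() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed 's/=.*//' | sort | uniq -d
}

# Shortened sample of the kernel command line from the dmesg output.
cmdline='console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 mtdparts=armada-nand:0x180000@0(u-boot) earlyprintk=serial mtdparts=armada-nand:0x180000@0(u-boot) reason=normal bdtype=rn102'
dup_params "$cmdline"
```

On the running box, the same check could be done with dup_params "$(cat /proc/cmdline)".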
-bodhi
tme May 28, 2024 02:03AM | Registered: 7 years ago Posts: 165 |
Hi bodhi,
Where do the extra environment variables come from? As I understand it, these are the only environment variables that are actually used by 'U-Boot' when booting:
$ sudo fw_printenv | egrep -i 'bootcmd=|sata_set_bootargs=|sata_boot=|mtdparts=|load_uimage=|load_uinitrd=' | sort
bootcmd=ide reset; run sata_bootcmd; reset
load_uimage=ext2load ide 0:1 0x2000000 /boot/uImage
load_uinitrd=ext2load ide 0:1 0x3000000 /boot/uInitrd
mtdparts=mtdparts=armada-nand:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs)
sata_boot=run load_uimage; if run load_uinitrd; then bootm 0x2000000 0x3000000; else bootm 0x2000000; fi
sata_bootcmd=run sata_set_bootargs; run sata_boot
sata_set_bootargs=setenv bootargs console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 $mtdparts earlyprintk=serial
$
And what about those passed to the user environment?
$ sudo fw_printenv | egrep -i 'reason|normal|bdtype|rn102'
SKUNum=RN102
Startup=Normal
Are they hard coded into the stock 'U-Boot'? Attached the full output of 'fw_printenv'.
Regards,
Trond Melen
Attachments:
open | download - fw_printenv.lst (2.4KB)
bodhi May 28, 2024 02:26PM | Admin Registered: 13 years ago Posts: 18,712 |
Hi Trond,
> Where does the extra environment variables come
> from? As I understand, these are the only
> environment variables that are actually used by
> 'U-Boot' when booting:
sata_bootcmd=run sata_set_bootargs; run sata_boot
sata_set_bootargs=setenv bootargs console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 $mtdparts earlyprintk=serial
The above look OK.
> And what about those passed to the user
> environment?
>
> $ sudo fw_printenv | egrep -i 'reason|normal|bdtype|rn102'
> SKUNum=RN102
> Startup=Normal
>
> Are they hard coded into the stock 'U-Boot'?
> Attached the full output of 'fw_printenv'.
I think this must be a case where stock u-boot is buggy and has a problem parsing the mtdparts, so it appends these extra bootargs. The result is that a wrong mtdparts is passed to the kernel. So the current MTD partitions in your system come from the DTB definition, not from the bootargs.
[ 3.184866] 5 fixed-partitions partitions found on MTD device pxa3xx_nand-0
[ 3.191969] Creating 5 MTD partitions on "pxa3xx_nand-0":
[ 3.197597] 0x000000000000-0x000000180000 : "u-boot"
[ 3.206874] 0x000000180000-0x0000001a0000 : "u-boot-env"
[ 3.215802] 0x000000200000-0x000000800000 : "uImage"
[ 3.225242] 0x000000800000-0x000000c00000 : "minirootfs"
[ 3.234744] 0x000000c00000-0x000008000000 : "ubifs"
The mtdparts should be corrected as:
mtdparts=mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs)
But currently you cannot set the envs in Debian using fw_setenv, because the envs are read-only in the DTB definition.
So it must be set in the serial console as follows:
setenv mtdparts 'mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs)'
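To double-check that an mtdparts string describes the same layout as the kernel's partition list, the size@offset entries can be expanded into start-end ranges. A minimal sketch, assuming a POSIX shell; parse_mtdparts is a made-up helper name, and the "rest of flash" entry is assumed to start where the previous partition ends.

```shell
# parse_mtdparts is a hypothetical helper, not part of any existing tool.
# Print "start-end name" for each partition in an mtdparts string.
parse_mtdparts() {
    list=${1#*:}            # drop the "mtdparts=<device>:" prefix
    end=0
    old_ifs=$IFS
    IFS=,
    for part in $list; do
        name=${part#*\(}
        name=${name%\)}
        case $part in
            -*)             # "-" means: whatever is left of the flash
                printf '0x%08x-(end of flash) %s\n' "$end" "$name"
                ;;
            *)
                size=${part%%@*}
                off=${part#*@}
                off=${off%%\(*}
                end=$(( $off + $size ))
                printf '0x%08x-0x%08x %s\n' "$(( $off ))" "$end" "$name"
                ;;
        esac
    done
    IFS=$old_ifs
}

parse_mtdparts 'mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs)'
```

The ranges should line up with the "Creating 5 MTD partitions" lines in the kernel log, apart from the number of leading zeros.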
======
Later, I'll update the DTS to remove the read-only property for the RN102 and RN104. If we cannot set the envs, they are not much use in Debian.
-bodhi
tme May 28, 2024 04:57PM | Registered: 7 years ago Posts: 165 |
Hi bodhi,
Thanks for your advice! The BUG message in the 'dmesg' output is now gone. Attached output on the serial console.
There are still some warnings printed in bold on the serial console:
[ 1.990632] gpio gpiochip0: Static allocation of GPIO base is deprecated, use dynamic allocation.
[ 2.000817] debugfs: Directory 'd0018100.gpio' with parent 'regmap' already present!
[ 2.010570] gpio gpiochip1: Static allocation of GPIO base is deprecated, use dynamic allocation.
[ 2.020570] debugfs: Directory 'd0018140.gpio' with parent 'regmap' already present!
[ 2.029418] gpio gpiochip2: Static allocation of GPIO base is deprecated, use dynamic allocation.
[ 6.393683] mtdblock: MTD device 'u-boot' is NAND, please consider using UBI block devices instead.
[ 6.406327] mtdblock: MTD device 'uImage' is NAND, please consider using UBI block devices instead.
[ 6.420679] mtdblock: MTD device 'u-boot-env' is NAND, please consider using UBI block devices instead.
[ 6.462804] mtdblock: MTD device 'minirootfs' is NAND, please consider using UBI block devices instead.
[ 6.491866] mtdblock: MTD device 'ubifs' is NAND, please consider using UBI block devices instead.
[ 23.088713] mtdblock: MTD device 'u-boot' is NAND, please consider using UBI block devices instead.
[ 23.317898] mtdblock: MTD device 'u-boot-env' is NAND, please consider using UBI block devices instead.
[ 23.404877] mtdblock: MTD device 'uImage' is NAND, please consider using UBI block devices instead.
[ 23.484129] mtdblock: MTD device 'minirootfs' is NAND, please consider using UBI block devices instead.
[ 23.590980] mtdblock: MTD device 'ubifs' is NAND, please consider using UBI block devices instead.
The setup variable 'mtdparts' is still reported differently by 'printenv' on the console
Marvell>> printenv mtdparts mtdparts=mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs)
and 'dmesg' by 'ssh':
[ 0.000000] Kernel command line: console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 mtdparts=armada-nand:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) earlyprintk=serial mtdparts=armada-nand:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) reason=normal bdtype=rn102
The setup variables passed to the user environment persist.
Regards,
Trond Melen
Attachments:
open | download - serial.lst (38.9KB)
open | download - dmesg.lst (29.2KB)
bodhi May 28, 2024 06:37PM | Admin Registered: 13 years ago Posts: 18,712 |
Hi Trond,
> Thanks for your advice! The BUG message in the
> 'dmesg' output is now gone. Attached output on the
> serial console.
Cool! It must have been some side effects that we could not see.
>
> There are still some warnings printed in bold on
> the serial console:
> [ 1.990632] gpio gpiochip0: Static allocation
> of GPIO base is deprecated, use dynamic
> allocation.
> [ 2.000817] debugfs: Directory 'd0018100.gpio'
> with parent 'regmap' already present!
> [ 2.010570] gpio gpiochip1: Static allocation
> of GPIO base is deprecated, use dynamic
> allocation.
> [ 2.020570] debugfs: Directory 'd0018140.gpio'
> with parent 'regmap' already present!
> [ 2.029418] gpio gpiochip2: Static allocation
> of GPIO base is deprecated, use dynamic
> allocation.
The above warnings are OK. What we have in the current DTS (from mainline) is the old static GPIO allocation. They just want to remind people to update. These are warnings only, so it's OK for now. I'll look at this when building the next MVEBU kernel version.
> [ 6.393683] mtdblock: MTD device 'u-boot' is
> NAND, please consider using UBI block devices
> instead.
> [ 6.406327] mtdblock: MTD device 'uImage' is
> NAND, please consider using UBI block devices
> instead.
> [ 6.420679] mtdblock: MTD device 'u-boot-env'
> is NAND, please consider using UBI block devices
> instead.
> [ 6.462804] mtdblock: MTD device 'minirootfs'
> is NAND, please consider using UBI block devices
> instead.
> [ 6.491866] mtdblock: MTD device 'ubifs' is
> NAND, please consider using UBI block devices
> instead.
> [ 23.088713] mtdblock: MTD device 'u-boot' is
> NAND, please consider using UBI block devices
> instead.
> [ 23.317898] mtdblock: MTD device 'u-boot-env'
> is NAND, please consider using UBI block devices
> instead.
> [ 23.404877] mtdblock: MTD device 'uImage' is
> NAND, please consider using UBI block devices
> instead.
> [ 23.484129] mtdblock: MTD device 'minirootfs'
> is NAND, please consider using UBI block devices
> instead.
> [ 23.590980] mtdblock: MTD device 'ubifs' is
> NAND, please consider using UBI block devices
> instead.
Same kind of warnings. The stock MTD layout has raw NAND partitions (i.e. to store firmware). When we use the mtd block device, the warnings are printed to tell people to use UBI instead (as they should). A typical kernel will not print this because the mtd block device module is not loaded immediately while the kernel is booting. But our kernel has it built in.
I was going to move it out to initramfs to make the kernel log less noisy and shrink uImage but forgot about it!
> The setup variable 'mtdparts' is still reported
> differently by 'printenv' on the console
> and 'dmesg' by 'ssh':
>
> [ 0.000000] Kernel command line: console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 mtdparts=armada-nand:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) earlyprintk=serial mtdparts=armada-nand:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) reason=normal bdtype=rn102
>
> The setup variables passed to the user environment
> persists.
This might be related to the syntax we use to set up the envs. Stock u-boot could be buggy in dereferencing the $mtdparts inside another setenv.
Try using single quotes and { } braces, as below:
setenv mtdparts 'mtdparts=armada-nand:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs)'
setenv sata_set_bootargs 'setenv bootargs console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 ${mtdparts} earlyprintk=serial'
-bodhi
Edited 1 time(s). Last edit at 05/28/2024 06:38PM by bodhi.
tme May 29, 2024 01:45AM | Registered: 7 years ago Posts: 165 |
Hi bodhi,
The full serial console is attached. This is the U-Boot dialog:
Marvell>> printenv mtdparts sata_set_bootargs
mtdparts=mtdparts=armada-nand:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs)
sata_set_bootargs=setenv bootargs console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 $mtdparts earlyprintk=serial
Marvell>> setenv mtdparts 'mtdparts=armada-nand:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs)'
Marvell>> setenv sata_set_bootargs 'setenv bootargs console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 ${mtdparts} earlyprintk=serial'
Marvell>> printenv mtdparts sata_set_bootargs
mtdparts=mtdparts=armada-nand:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs)
sata_set_bootargs=setenv bootargs console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 ${mtdparts} earlyprintk=serial
The BUG message is back:
[ 29.948156][ T110] BUG: scheduling while atomic: kworker/0:2/110/0x00000002
Should the 'mtdparts' setup parameter include "mtdparts=pxa3xx_nand-0" or "mtdparts=armada-nand", or maybe it doesn't matter?
Regards,
Trond Melen
Attachments:
open | download - serial.lst (52.5KB)
bodhi May 30, 2024 02:17PM | Admin Registered: 13 years ago Posts: 18,712 |
Hi Trond,
Quote
Marvell>> printenv mtdparts sata_set_bootargs
mtdparts=mtdparts=armada-nand:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs)
> The BUG message is back:
>
> [ 29.948156][ T110] BUG: scheduling while atomic: kworker/0:2/110/0x00000002
The mtdparts env was not set correctly, so you're actually back to the previous boot setup.
So set it in serial console
setenv mtdparts 'mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs)'
setenv sata_set_bootargs 'setenv bootargs console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 ${mtdparts} earlyprintk=serial'
and then save it (saving is not necessary for this boot, but it lets the next boot proceed without interruption).
saveenv
boot
> Should the 'mtdparts' setup parameter include
> "mtdparts=pxa3xx_nand-0" or
> "mtdparts=armada-nand", or maybe it doesn't matter?
Yes, it must be pxa3xx_nand-0 for the NAND driver to be found and run in the kernel.
If setting it to the correct NAND driver pxa3xx_nand makes the kernel BUG go away, then I think armada-nand, as a non-existent driver, might have something to do with it.
-bodhi
tme June 06, 2024 02:02PM | Registered: 7 years ago Posts: 165 |
Hi bodhi,
I'm sorry for this slow response! And sorry for having forgotten that modifications to the U-Boot environment are volatile until saved.
I've experimented a little. The BUG line comes and goes. Sometimes there is no such line, sometimes it refers to "mdadm" and sometimes it refers to "kworker". These lines are from 10 reboots with no modifications to the U-Boot environment in between:
[ 29.788241] BUG: scheduling while atomic: kworker/0:2/110/0x00000002
na.
[ 30.028215] BUG: scheduling while atomic: kworker/0:2/110/0x00000002
[ 27.944406] BUG: scheduling while atomic: mdadm/1710/0x00000002
[ 27.628423][ T110] BUG: scheduling while atomic: kworker/0:2/110/0x00000002 (Got stuck here! Power cycle required to reset.)
na.
[ 29.948205][ T8] BUG: scheduling while atomic: kworker/0:0/8/0x00000002
[ 29.868228][ T110] BUG: scheduling while atomic: kworker/0:2/110/0x00000002
[ 29.948315][ T110] BUG: scheduling while atomic: kworker/0:2/110/0x00000002 (Got stuck here! Power cycle required to reset.)
na.
To me this looks like a race condition where the end result depends on small timing differences. It may (or may not) be related to the flaw in the system clock. I did not check if the real time clock will reset the box when it's stuck, but it would come as a surprise.
After boot, 'fw_setenv' and 'fw_printenv' seem to work fine, but root privileges are required. '${mtdparts}' in the U-Boot environment does not seem to work. I found no way of avoiding the "$mtdparts" being reported twice by the kernel right after a reboot, but I don't think it does any harm.
Then I went on to test the same kernel with the new DTB. It failed with
[ 3.310045][ T1] mdio_bus d0072004.mdio-mii: MDIO device at address 1 is missing.
and powered off the box! The serial console output is attached.
BTW, there seems to be a 32 character limit on image names. "initramfs-6.8.7-mvebu-370xp-tld-1" has 33 characters and is truncated by 'mkimage' on the box.
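As far as I know, the legacy uImage header reserves 32 bytes for the image name (IH_NMLEN in U-Boot's image.h), which would explain the truncation. A small sketch to check a name before running 'mkimage'; check_name is a made-up helper and the limit of 32 is that assumption:

```shell
# Assumed limit: 32 bytes for the name field of a legacy uImage header.
limit=32

# check_name is a hypothetical helper: report whether a name fits the limit.
check_name() {
    len=${#1}
    if [ "$len" -gt "$limit" ]; then
        echo "$1: $len chars, over the limit by $(( len - limit ))"
    else
        echo "$1: $len chars, fits"
    fi
}

check_name "initramfs-6.8.7-mvebu-370xp-tld-1"
```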
Regards,
Trond Melen
Attachments:
open | download - serial.lst (21.8KB)
bodhi June 06, 2024 02:47PM | Admin Registered: 13 years ago Posts: 18,712 |
Hi Trond,
> I've experimented a little. The BUG-line come and
> go. Sometimes there is no such line, sometimes it
> refers to "mdadm" and sometimes it refers to
> "kworker".
> To me this looks like a race condition where the
> end result depends on small timing differences. It
> may (or may not) be related to the flaw in the
> system clock. I did not check if the real time
> clock will reset the box when it's stuck, but it
> would come as a surprise.
I think you might be right about the timing difference. I ran the new kernel on the Mirabox for a day, and saw no BUG.
>
> After boot, 'fw_setenv' and 'fw_printenv' seems to
> work fine, but root privileges are required.
> '${mtdparts}' in the U-Boot environment does not
> seems to work. I found no way of avoiding the
> "$mtdparts" being reported twice by the kernel
> right after a reboot, but I don't think it does
> any harm.
So to confirm, if the bootargs are sane (no repeated mtdparts and no other garbage), you don't see the BUG?
You could also remove '${mtdparts}' from the bootargs. It is only needed for accessing the u-boot envs in Debian. Getting clean bootargs would be good, to eliminate any unknown side effect.
> Then I went on to test the same kernel with the
> new DTB. It failed with
>
> [ 3.310045][ T1] mdio_bus d0072004.mdio-mii: MDIO device at address 1 is missing.
Bummer, please go back to the current DTB in the kernel linux-6.8.7-mvebu-370xp-tld-1.
> BTW, there seems to be a 32 character limit on
> image names. "initramfs-6.8.7-mvebu-370xp-tld-1"
> has 33 characters and is truncated by 'mkimage' on
> the box.
That's right! I've noticed that too. Perhaps I should use a shorter naming convention :)
-bodhi
tme June 07, 2024 01:54PM | Registered: 7 years ago Posts: 165 |
Hi bodhi,
Quote
I think you might be right about the timing difference. I ran the new kernel on the Mirabox for a day, and saw no BUG.
When the line "BUG: scheduling while atomic" appears on the serial console (and later in the 'dmesg' output), it is always between 27 s and 30 s into the kernel boot. It is always followed by two debug stack dumps as in the attachment to this post as well as to my first post on this kernel version.
Quote
So to confirm, if the bootargs is sane (no repeated mtdparts and other bad garbage), you don't see the BUG?
The U-Boot environment looks clean. Still, 'dmesg' reports the value of $mtdparts twice:
root@rn102:~# dmesg | grep mtdparts | fold -sw 80
[ 0.000000] Kernel command line: console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) earlyprintk=serial mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) reason=normal bdtype=rn102
I assume that $mtdparts is also reported twice on the serial console during boot, even though I don't see it. I assume 'minicom' truncates long lines.
The first time I followed the boot of Linux kernel version 6.8.7 on the console, warnings were in bold. I liked it, but now it's gone. Is there a way to restore that behavior?
I just discovered some error messages from the kernel on the serial console at login. The same messages are repeated at each subsequent login:
stop_machine_cpuslocked+0x108/0x13c from patch_text+0x24/0x4c
stop_machine_cpuslocked+0x108/0x13c from patch_text+0x24/0x4c
These messages are not present in the 'dmesg' output.
To conclude, I cannot currently recommend Linux kernel version 6.8.7 for Netgear RN102. The major problem is that the boot may freeze after ~30 s, requiring the box to be unplugged from power to recover. This happens in about one out of five reboots.
The output on the serial console after restoring the DTS is attached. (The U-Boot environment was printed before boot.) What do you think the debug stack traces and the kernel messages at login tell us?
Regards,
Trond Melen
Attachments:
open | download - serial.lst (52.7KB)
bodhi June 07, 2024 03:43PM | Admin Registered: 13 years ago Posts: 18,712 |
Hi Trond,
> The U-Boot environment looks clean. Still, 'dmesg'
> report the value of $mtdparts twice:
> I assume that $mtdparts is also reported twice on
> the serial console during boot, even though I
> don't see it. I assume 'minicom' truncates long
> lines.
>
> The first time I followed the boot of Linux kernel
> version 6.8.7 on the console, warnings were in
> bold. I liked it, but now it's gone. Is there a
> way to restore that behavior?
Not sure how.
> The output on the serial console after restoring
> the DTS is attached. (The U-Boot environment was
> printed before boot.) What do you think the debug
> stack traces and the kernel messages at login tell
> us?
I think it is a real BUG in the kernel. Unfortunately it shows up in our custom kernel and only in this RN102 box so it won't be well accepted if I send the bug report.
=======
I think I will reconfigure it back to non SMP in an interim kernel, e.g. linux-6.8.7-mvebu-370xp-tld-2. And then see if it will trigger the same bug (I don't think so, but we will see).
In the meantime, could you boot this 6.8.7 kernel with the following change to the bootargs? Power up, interrupt the boot at the serial console, and run:
setenv sata_set_bootargs 'setenv bootargs console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 earlyprintk=serial'
boot
(I just don't like to see the kernel bootargs being messed up. If you notice, "reason=normal bdtype=rn102" looks like garbage that comes after the mtdparts env. This u-boot could have a bug where, when it resolves $mtdparts, it reads a longer string from memory.)
Quote
mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@
0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) earlyprintk=serial
mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@
0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) reason=normal
bdtype=rn102
-bodhi
tme June 07, 2024 08:52PM | Registered: 7 years ago Posts: 165 |
Hi bodhi,
Quote
I just don't like to see the kernel bootargs being messed up.
I halted U-Boot, modified "sata_set_bootargs" and booted five times:
root@rn102:~# dmesg | grep mtdparts | fold -sw 80
[ 0.000000] Kernel command line: console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 earlyprintk=serial mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) reason=normal bdtype=rn102
root@rn102:~# dmesg | grep "scheduling while atomic"
[ 29.788416] BUG: scheduling while atomic: kworker/0:2/110/0x00000002

[ 36.188439][ T110] BUG: scheduling while atomic: kworker/0:2/110/0x00000002

root@rn102:~# dmesg | grep mtdparts | fold -sw 80
[ 0.000000] Kernel command line: console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 earlyprintk=serial mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) reason=normal bdtype=rn102
root@rn102:~# dmesg | grep "scheduling while atomic"
[ 30.348387] BUG: scheduling while atomic: kworker/0:2/110/0x00000002

root@rn102:~# dmesg | grep mtdparts | fold -sw 80
[ 0.000000] Kernel command line: console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 earlyprintk=serial mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) reason=normal bdtype=rn102
root@rn102:~# dmesg | grep "scheduling while atomic"
[ 30.348387] BUG: scheduling while atomic: kworker/0:2/110/0x00000002

root@rn102:~# dmesg | grep mtdparts | fold -sw 80
[ 0.000000] Kernel command line: console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 earlyprintk=serial mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) reason=normal bdtype=rn102
root@rn102:~# dmesg | grep "scheduling while atomic"
root@rn102:~#
Setting the U-Boot setup variable sata_set_bootargs to its final value made mtdparts be reported only once, but not much else changed. The "reason=normal bdtype=rn102" part persists.
The "BUG: scheduling while atomic" appeared in all but the 5th boot. The 2nd boot brought the box to a complete halt, requiring disconnecting it from power to get going.
Quote
tme
When the line "BUG: scheduling while atomic" appears on the serial console (and later in the 'dmesg' output), it is always between 27 s and 30 s into the kernel boot.
Well, the interval must be extended to 27 s to 37 s to include the 2nd boot.
Regards,
Trond Melen
bodhi June 07, 2024 10:11PM | Admin Registered: 13 years ago Posts: 18,712 |
Hi Trond,
> I halted U-Boot, modified "sata_set_bootargs" and
> booted five times:
> root@rn102:~# dmesg | grep mtdparts | fold -sw 80
> [ 0.000000] Kernel command line: console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 earlyprintk=serial mtdparts=pxa3xx_nand-0:0x180000@0(u-boot),0x20000@0x180000(u-boot-env),0x600000@0x200000(uImage),0x400000@0x800000(minirootfs),-(ubifs) reason=normal bdtype=rn102
That's really strange. I need to look at the envs more closely.
-bodhi
bodhi June 12, 2024 03:18PM | Admin Registered: 13 years ago Posts: 18,712 |
Hi Trond,
Have you seen this problem that Jugle is seeing with network receive buffer overrun?
https://forum.doozan.com/read.php?2,137625,137696#msg-137696
-bodhi
tme June 13, 2024 02:18PM | Registered: 7 years ago Posts: 165 |
Hi bodhi,
Quote
Have you seen this problem that Jugle is seeing with network receive buffer overrun?
No, I have not:
tme@rn102:~$ uptime
 19:45:39 up 5 days, 5:10, 2 users, load average: 0.00, 0.00, 0.00
tme@rn102:~$ dmesg | grep overrun
tme@rn102:~$
IIRC, the RN104 has two Ethernet ports, while the RN102 has one. The SoC has two interfaces on both. Could it be that the overrun occurs on the one that is unused on the RN102?
I've noticed that @jungle_roger got the "BUG: scheduling while atomic" message already after 10 seconds:
[ 10.428258] BUG: scheduling while atomic: kworker/0:0/8/0x00000002
Could it be that "/bin/systemd" exercises the kernel's scheduler more intensively than "/sbin/init"?
Regards,
Trond Melen
bodhi June 13, 2024 02:47PM | Admin Registered: 13 years ago Posts: 18,712 |
Hi Trond,
> Quote
> Have you seen this problem that Jugle is
> seeing with network receive buffer
> overrun?
>
> No, I have not:
Good to hear that!
>
> tme@rn102:~$ uptime
>  19:45:39 up 5 days, 5:10, 2 users, load average: 0.00, 0.00, 0.00
> tme@rn102:~$ dmesg | grep overrun
> tme@rn102:~$
>
> IIRC, RN104 has two Ethernet ports, while RN102 has
> one. The SOC has two interfaces on both. Could it
> be that the overrun occurs on the one that is
> unused on RN102?
No, the overruns are on eth0
Quote
[ 692.478184] mvneta d0070000.ethernet eth0: bad rx status 0f830000 (overrun error), size=448
>
> I've noticed that @jungle_roger got the "BUG:
> scheduling while atomic" message already after 10
> seconds:
>
> [ 10.428258] BUG: scheduling while atomic: kworker/0:0/8/0x00000002
>
> Could it be that "/bin/systemd" exercises the
> kernel's scheduler more intensively than
> "/sbin/init"?
At that point in the boot, it looks like systemd is not running yet.
-bodhi