Compaq EVO N610c bugs


Strictly speaking these are not really bugs; they are (or were) just things that bugged me. I managed to solve most of them, at least to a level that I'm happy with. Here's the list:

1) Back-light control - Solved
    There is no back-light control in the Linux kernel for this computer. No matter which kernel options I chose, nothing worked.

2) SPD memory - Solved
    RAM DIMMs have a small EEPROM chip that holds details of how the memory controller should be configured to work with the RAM. The usual internet advice is just to run the decode-dimms command. Nope, not working here buddy.

3) System EEPROM - Still a bugger
    On page 22 of the schematics there is an eight legged creature labelled "System EEPROM". What the frack are you and what are you doing there?

4) There is a debugger in my BIOS and I like catching bugs. How do I use it?? - Solved

5) To spin or not to spin, that is the question - Solved
    I was always puzzled by the threshold temperatures chosen by Compaq to spin the fan. Could I do something about it?

(You'll need to read the Compaq N610c page first before continuing.)

Back-light control

Part 1)

So, what is the motivation? Believe it or not, the screen is just too bright to work at night in a very dimly lit room. I understand that a bright screen is useful when you're working in an office, by the window, in a sunny country, overlooking the beach (dream on...) so we cannot fault AUO for doing this. On the contrary, a big thanks to you AUO. 
(By the way, there are some great animations on their site.)

Indeed, I know that it is possible to just type:
$ xrandr --output LVDS --brightness 0.5
at the prompt to reduce the brightness to half, but this feels like cheating because all that this command is doing is making all the pixels darker, i.e. you have a bright light behind and get the liquid crystal to hide it. The cheat can also be seen in the mouse cursor, which remains bright.

So, I knew that removing the charger changes the "proper" brightness and what this means is that:
- The LM2601 adapter interface chip senses the power change and changes the ACPRES_3_IO signal

- That signal ends up in the Super-IO controller's pin 155, aka IN7
(The trick here is to know that connector CN1002 is the other side of connector CN501)

- The Super-IO is configured to react to changes on this pin and its interrupt routines then raise an SCI to the Pentium 4 and operating system

Now, an SCI is a System Control Interrupt and is described in the ACPI specifications this way (quoting from the specs):
"A system interrupt used by hardware to notify the OS of ACPI events."

So, what does the almighty internet have to say about all of this?
The theory goes that when such an interrupt is received, the kernel looks for the device that generated it, gets the event number, looks in the ACPI tables that the BIOS made available for the code that handles that event, and then executes it.

Could I then see all this process in action?
Well, I ended up going to the ACPICA downloads page, downloading the code and following the instructions to make the package and generate the utilities.
Then all that was left was to run them in the right order:

(Image: ACPI dump commands)

The process above dumps the ACPI tables that the Linux kernel is aware of to a file, then separates that file into the 4 component tables and disassembles them into four .dsl files with source code in the ASL language.
Apart from the variable names, these files are copies of the source code that Compaq compiled and placed in the BIOS, and which regulates the ACPI functionality of the laptop. So, we have a bit more than 200K of code to go through.
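Splitting the dump into separate tables is possible because every ACPI table begins with the same header: a 4-character signature, then a 32-bit little-endian length, with a checksum byte that makes all bytes of the table sum to zero. Here is a minimal sketch of those three checks. The function names are mine, not ACPICA's, and this is only an illustration of the header layout, not what the tools actually run.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes 4..7 of every ACPI table hold its total length (little-endian). */
uint32_t acpi_table_length(const uint8_t *table)
{
    return (uint32_t)table[4] | ((uint32_t)table[5] << 8) |
           ((uint32_t)table[6] << 16) | ((uint32_t)table[7] << 24);
}

/* Bytes 0..3 hold the signature, e.g. "DSDT" or "FACP". */
int acpi_has_signature(const uint8_t *table, const char *sig)
{
    return memcmp(table, sig, 4) == 0;
}

/* A table is consistent when all of its bytes sum to zero modulo 256. */
int acpi_checksum_ok(const uint8_t *table, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + table[i]);
    return sum == 0;
}
```

The checksum is also the reason a table cannot simply be patched with a hex editor: any changed byte has to be balanced by the checksum byte, which the compiler takes care of when a table is rebuilt from source.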

Blog posts like this and this are really good, but did I really need to learn a new programming language just to change the back-light? As this looked tedious and time consuming, I made a mental note to return to this subject later, tried another way and carried on.

Part 2)

There is a small note in the LCD datasheet like this:

The inverter is the circuit needed by the system to generate the high voltages required by the CCFL (Cold Cathode Fluorescent Lamp), i.e. the back-light. The note implies that the brightness may be controlled by PWM.

PWM stands for Pulse Width Modulation. In the most common use, it refers to square waves or clocks where the time the signal is on may be different from the time it is off. The fraction of each period during which the signal is on, usually given as a percentage, is called the duty cycle.

On page 18 of the schematic diagram we find the LCD connector and there is, indeed, a line there called INV_PWM_3.

Since it goes to page 21 then it must connect to the Super-IO chip...

... and so it does, to pin 201 aka OUT9.
This pin can be configured to be the result of the PWM2 Function. (Datasheet typo bonus, can you spot it?)

The whole of Chapter 20 in the datasheet is dedicated to the programmable Pulse Width Modulators.
It's just 3 pages, mostly tables and the most relevant information is here.

Note that this is also a Mailbox indexed Register and that I had encountered these before, when I read the EEPROM of the 8051.

So I made another one of my extremely stupid programs, a proof of concept that "just works".

#include <stdio.h>
#include <sys/io.h>
#include <sys/types.h>

#include <unistd.h>
#include <fcntl.h>

u_int8_t tmp[8];

// Select a logical device and return its base I/O address
int print_addr(u_int8_t log_device) {

  outb_p(0x07,0x2E);  outb_p(log_device,0x2F);

  outb_p(0x60,0x2E); tmp[3]=inb(0x2F); // Register 0x60, address high byte
  outb_p(0x61,0x2E); tmp[4]=inb(0x2F); // Register 0x61, address low byte

  printf("Log Dev %d address = %04X\n",log_device, (tmp[3] << 8) | tmp[4]);
  return (tmp[3] << 8) | tmp[4];
}

int main () {
  unsigned int x, y;

  // Port I/O needs root; iopl(3) also covers ports above 0x3FF
  if (iopl(3) < 0) { perror("iopl"); return 1; }

  outb_p(0x55,0x2E);   // Enter configuration mode
  outb_p(0x26,0x2E);  tmp[1]=inb(0x2F);
  outb_p(0x27,0x2E);  tmp[2]=inb(0x2F);
  printf("\nLPC Address %04X\n\n",(tmp[2] << 8) | tmp[1]);

  printf("Mailbox  ");x=print_addr(9);

  // Mailbox index 0x95 is written to port x, the new value to port x+1
  printf("Changed to 90%%\n"); y=0x80|(( 56 )<<1);
  outb_p(0x95,x); outb_p(y,x+1);
  printf("Changed to 80%%\n"); y=0x80|(( 50 )<<1);
  outb_p(0x95,x); outb_p(y,x+1);
  printf("Changed to 70%%\n"); y=0x80|(( 44 )<<1);
  outb_p(0x95,x); outb_p(y,x+1);
  printf("Changed to 60%%\n"); y=0x80|(( 37 )<<1);
  outb_p(0x95,x); outb_p(y,x+1);
  printf("Changed to 50%%\n"); y=0x80|(( 31 )<<1);
  outb_p(0x95,x); outb_p(y,x+1);
  printf("Changed to 40%%\n"); y=0x80|(( 25 )<<1);
  outb_p(0x95,x); outb_p(y,x+1);
  printf("Changed to 30%%\n"); y=0x80|(( 19 )<<1);
  outb_p(0x95,x); outb_p(y,x+1);
  printf("Changed to 20%%\n"); y=0x80|(( 13 )<<1);
  outb_p(0x95,x); outb_p(y,x+1);
  printf("Changed to 10%%\n"); y=0x80|((  6 )<<1);
  outb_p(0x95,x); outb_p(y,x+1);
  printf("Changed to 00%%\n"); y=0x80|((  0 )<<1);
  outb_p(0x95,x); outb_p(y,x+1);

  printf("Back to 100%%\n");
  outb_p(0x95,x); outb_p(0xfe,x+1);

  // Exit Configuration mode (assuming the usual 0xAA key that pairs with 0x55)
  outb_p(0xAA,0x2E);
  return 0;
}
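
The magic numbers in the program follow a simple pattern, which can be sketched as a helper. This is my reading of the values used above, not something taken from the datasheet: I assume bit 7 must stay set (the program always sets it) and that bits 6 to 1 hold a 6-bit duty value where 63 means fully on, since 0xFE is what gets written for 100%. The name pwm_byte is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: brightness percentage -> byte for mailbox index 0x95.
   Assumptions: bit 7 is always set, bits 6..1 carry a duty value of 0..63. */
uint8_t pwm_byte(unsigned percent)
{
    if (percent > 100)
        percent = 100;
    unsigned duty = percent * 63 / 100;   /* integer division truncates */
    return (uint8_t)(0x80 | (duty << 1));
}
```

Truncation reproduces most of the hand-picked values (56, 50, 44, 37, 31, 25 and 6). The program uses 19 and 13 for 30% and 20%, where this helper gives 18 and 12, a difference of one step that is not visible on the screen.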


SPD Memory

In my quest to learn more about this laptop there came a time when I decided to find out more about the RAM. What RAM did I really have, what were the specs, etc, etc...
In all its cleverness, all that the internet would tell me was to use the command 'decode-dimms' from the i2c-tools package. Obviously, it did not work...

Then, the internet blurted... You're an idiot, you need to have the eeprom driver in your kernel.
And so I did, I compiled my kernel with this driver. But 'decode-dimms' was still not working.
The man page for 'decode-dimms' states it is a tool "to decode the information found in memory module SPD EEPROMs."

But, what is SPD? Aren't we talking about RAM?
SPD stands for Serial Presence Detect; the data lives in a small EEPROM chip that has been fitted to memory modules since the mid-nineties. Wikipedia has a very nice article about it and for me the two key sentences were these:

"The SPD EEPROM is accessed using SMBus, a variant of the I²C protocol. This reduces the number of communication pins on the module to just two: a clock signal and a data signal."

Looking at the schematic we can see the signals in the Dual Inline Memory Module connector pins 195 and 193:

Which leads us to page 19 and to the corresponding pins in the ICH3-M.

Sub-chapter 5.17 and chapter 12 describe this device in detail. Quite helpful also, for reference, is the applicable SMBus standard.

So now I knew I also had to include some more stuff in the kernel.

This option needed to be selected...

...along with this one

Another reboot later... Did it work?

Nope. The manual says I should see a 00:1f.3 device but it's just not there.

What was I doing wrong? Would I ever figure it out???
Something must be wrong with the driver, right? Actually with the whole PCI enumeration stuff.
So I checked the Linux kernel documentation. There's a file called 'i2c-i801' in the Documentation/i2c/busses/ folder which states this...

Hidden ICH SMBus

If your system has an Intel ICH south bridge, but you do NOT see the
SMBus device at 00:1f.3 in lspci, and you can't figure out any way in the
BIOS to enable it, it means it has been hidden by the BIOS code. Asus is
well known for first doing this on their P4B motherboard, and many other
boards after that. Some vendor machines are affected as well.

The first thing to try is the "i2c_ec" ACPI driver. It could be that the
SMBus was hidden on purpose because it'll be driven by ACPI. If the
i2c_ec driver works for you, just forget about the i2c-i801 driver and
don't try to unhide the ICH SMBus. Even if i2c_ec doesn't work, you
better make sure that the SMBus isn't used by the ACPI code. Try loading
the "fan" and "thermal" drivers, and check in /proc/acpi/fan and
/proc/acpi/thermal_zone. If you find anything there, it's likely that
the ACPI is accessing the SMBus and it's safer not to unhide it. Only
once you are certain that ACPI isn't using the SMBus, you can attempt
to unhide it.

In order to unhide the SMBus, we need to change the value of a PCI
register before the kernel enumerates the PCI devices. This is done in
drivers/pci/quirks.c, where all affected boards must be listed (see
function asus_hides_smbus_hostbridge.) If the SMBus device is missing,
and you think there's something interesting on the SMBus (e.g. a
hardware monitoring chip), you need to add your board to the list.

I looked at this and thought, wow, this is scary. I don't want to unhide the controller and destroy the computer due to overheating.

I decided to drown my frustrations in something else, maybe some inspiration would come...

I picked up the schematic diagram again and played a game of spot the bus. SMBus is just a special case of I²C, and I found many of these buses on the motherboard.
The Super-IO has two I²C controllers, each controlling a pair of buses by means of multiplexing. The datasheet calls them ACCESS.Bus. I have no idea why... Here is the specification if you're interested.

Following the bus lines on the schematic diagram we see that these devices connect to the Super-IO chip:
- A system EEPROM (page 22)
- A thermal sensor for the CPU (page 39)
- The main battery (page 40)
- The extra battery, which can be installed in the CD-ROM/floppy bay (page 40)
- The docking station connector (page 35)
- The mini-pci connector (page 36)

These devices connect to ICH3-M:
- The SO-DIMM memory connectors
- The ICS950805 Frequency Generator chip

The M7 ATI Radeon chip has independent I²C buses for
- LCD, VGA and DVI connectors
In this case the I²C bus is "renamed" DDC.

So, actually, the ICH3-M's SMBus only talks to the memories and to the frequency generator chip. From reading the manual of this chip, my understanding is that it is only programmed by ACPI when the computer needs to suspend or hibernate.

Back in the ICH3-M datasheet I found this register.

I felt like I was finally getting somewhere. Let's see...

0x8749 = 1000 0111 0100 1001

So, bit 3 is 1, meaning the device is disabled, and bit 0 is 1, meaning that the IO space was left enabled.
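The same reading of the bits, as a small sketch. The function names are hypothetical; the bit meanings are the ones described above for the register at offset 0xF2. Because PCI configuration space is read 32 bits at a time on aligned addresses, register 0xF2 arrives as the upper half of the dword at 0xF0, so bit 3 of 0xF2 is bit 19 of that dword.

```c
#include <stdint.h>

/* Bits of the 16-bit register at offset 0xF2, as read above. */
#define F2_SMBUS_DISABLED  (1u << 3)   /* 1 = SMBus PCI function hidden  */
#define F2_SMBUS_IO_KEPT   (1u << 0)   /* 1 = its I/O space stays active */

int smbus_is_hidden(uint16_t f2)
{
    return (f2 & F2_SMBUS_DISABLED) != 0;
}

int smbus_io_enabled(uint16_t f2)
{
    return (f2 & F2_SMBUS_IO_KEPT) != 0;
}

/* Clear bit 3 of register 0xF2 inside the dword read from offset 0xF0. */
uint32_t unhide_smbus(uint32_t dword_at_f0)
{
    return dword_at_f0 & ~(1u << (16 + 3));   /* same as & 0xFFF7FFFF */
}
```

With the value found here, smbus_is_hidden(0x8749) is true, and clearing the bit turns 0x8749 into 0x8741.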

Time to write another one of my programs...
The program clears bit 3 and then reads the PCI configuration space of the SMBus device which should now be visible.

#include <stdio.h>
#include <sys/io.h>
#include <sys/types.h>

// Read one byte from PCI configuration space (ports 0xCF8/0xCFC)
u_int8_t pciConfigReadByte (u_int8_t bus, u_int8_t slot,
                             u_int8_t func, u_int8_t offset)
{
    u_int32_t address;
    u_int32_t lbus  = (u_int32_t)bus;
    u_int32_t lslot = (u_int32_t)slot;
    u_int32_t lfunc = (u_int32_t)func;
    u_int16_t tmp = 0;
    address = (u_int32_t)((lbus << 16) | (lslot << 11) |
              (lfunc << 8) | (offset & 0xfc) | ((u_int32_t)0x80000000));
    outl (address, 0xCF8);
    tmp = (u_int8_t)((inl (0xCFC) >> ((offset & 3) * 8)) & 0xff);
    return (tmp);
}

// Read one 16-bit word from PCI configuration space
u_int16_t pciConfigReadWord (u_int8_t bus, u_int8_t slot,
                             u_int8_t func, u_int8_t offset)
{
    u_int32_t address;
    u_int32_t lbus  = (u_int32_t)bus;
    u_int32_t lslot = (u_int32_t)slot;
    u_int32_t lfunc = (u_int32_t)func;
    u_int16_t tmp = 0;
    address = (u_int32_t)((lbus << 16) | (lslot << 11) |
              (lfunc << 8) | (offset & 0xfc) | ((u_int32_t)0x80000000));
    outl (address, 0xCF8);
    tmp = (u_int16_t)((inl (0xCFC) >> ((offset & 2) * 8)) & 0xffff);
    return (tmp);
}

int main () {

  int i,j;
  u_int8_t bus, dev, func, reg;
  u_int32_t address;
  u_int32_t tmp;

  // Port I/O needs root privileges
  if (iopl(3) < 0) { perror("iopl"); return 1; }

  bus =  0;  dev =  0x1F;  func=  0;  reg =  0xF2;

  address = (u_int32_t)((bus << 16) | (dev << 11) |
            (func << 8) | (reg & 0xfc) | ((u_int32_t)0x80000000));
  outl (address, 0xCF8);
  tmp = inl (0xCFC);

  printf("\n   %08X\n\n",tmp);

  tmp = tmp & 0xFFF7FFFF; // This will clear bit 3 of register 0xF2

  printf("\n   %08X\n\n",tmp);

  // Write the new value back, otherwise the device stays hidden
  outl (address, 0xCF8);
  outl (tmp, 0xCFC);

  bus =  0;  dev =  0x1F;  func=  3;  reg =  0;

  printf("\nBus:%02X Device:%02X Function:%02X\n\n", bus, dev, func );

  // Dump the 256 bytes of configuration space, 16 per row
  for (i=0;i<16;i++) {
    printf("%02X: ",i*16);
    for(j=0;j<16;j++) {
      printf("%02X ",pciConfigReadByte(bus, dev, func, reg++));
    }
    printf("\n");
  }
  return 0;
}

This is it, it worked!


I have highlighted bytes 0x20 and 0x21 because they define the IO Space base address of the device.
So, it's at IO port 0x1200.
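For reference, this is how the two bytes turn into a port number. It is a sketch with a hypothetical function name, and the byte values in the usage note are my assumption of what the dump shows: in a PCI I/O base address register the bytes are little-endian, bit 0 is always 1 to mark I/O space and bit 1 is reserved, so both are masked off.

```c
#include <stdint.h>

/* Combine configuration bytes 0x20 (low) and 0x21 (high) into the
   I/O base address, dropping the two low flag bits of the BAR. */
uint16_t io_bar_base(uint8_t byte20, uint8_t byte21)
{
    uint16_t raw = (uint16_t)(byte20 | (byte21 << 8));
    return (uint16_t)(raw & ~0x3u);
}
```

Assuming the dump reads 01 12 at offsets 0x20 and 0x21, io_bar_base(0x01, 0x12) gives 0x1200.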

$ cat /proc/ioports shows it there

But is this even used by ACPI at all? Building on the knowledge acquired in the previous bug I looked for 1200 in the dsdt.dsl file that I had extracted before.

And yes, it is here: ACPI tells the Linux kernel to reserve it as a motherboard resource and not to use this address space for anything else.

There is also the definition of an OperationRegion with variable names, sizes and a method that uses them.

So, ACPI uses the device but, as we have seen, physically, only the memories and the frequency generator are connected to it. The SPD data is only needed once at boot up by the BIOS routines that initialize the laptop. Changing whether certain clocks are active is only done at power state changes.

The next step is dangerous (AND IT WILL VOID YOUR WARRANTY, LOL) in the sense that it may no longer be possible to suspend the laptop safely afterwards, but I considered the risk low enough to try this.

You will notice that the kernel changed the port number to 0x1080 and assigned a driver to the device.

And 'decode-dimms' works now.
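To give an idea of what the tool reads, here is a minimal sketch of two checks on the first 64 SPD bytes. It is not the decode-dimms code and the function names are hypothetical. It relies on two facts of the SPD layout used by SDR, DDR and DDR2 modules: byte 2 identifies the memory type, and byte 63 is the low 8 bits of the sum of bytes 0 to 62.

```c
#include <stddef.h>
#include <stdint.h>

/* SPD byte 2 identifies the memory technology. */
const char *spd_memory_type(uint8_t byte2)
{
    switch (byte2) {
    case 0x04: return "SDR SDRAM";
    case 0x07: return "DDR SDRAM";
    case 0x08: return "DDR2 SDRAM";
    default:   return "other";
    }
}

/* Byte 63 must equal the sum of bytes 0..62 modulo 256. */
int spd_checksum_ok(const uint8_t spd[64])
{
    uint8_t sum = 0;
    for (size_t i = 0; i < 63; i++)
        sum = (uint8_t)(sum + spd[i]);
    return sum == spd[63];
}
```

The bytes themselves would come from the EEPROM on the module, read over the SMBus that has just been unhidden.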

Just out of curiosity here are some BIOS routines...

The SMBus routine.

And the beginning of the MCH-M memory configuration routine


System EEPROM

Here it is, on page 22.

It is an NM24C02. I have found the respective datasheet.
This is a very common chip, in fact it is the same chip that Kingston used to hold the SPD data on the RAM DIMM.
All I know is that it connects to the Super-IO. I want to read the data in there, but haven't been able to find any information about this on the internet. There is this great blog post, particularly this part that seems relevant, but not much more.
"The EC has one additional function. The ACPI spec allows for an i2c bus to be implemented through the EC, with EC registers mapping to i2c registers. The observant among you will realise that this means that there's an indexed access protocol being implemented on top of indexed access hardware, which is more layers of indirection than seem sane. For additional humour, this is usually only used to add support for ACPI smart batteries. ACPI batteries are generally abstracted behind a set of ACPI methods that provide information. Smart batteries instead speak i2c directly to the OS[2] for no real benefit. Linux handles these devices fine, and while the chances are you probably don't have one, the chances are also that if you do you haven't noticed."
My plan is to tackle this "bug" next when I manage to find some free time.

Debugger in the BIOS

This is the result of running the strings command on the part of the BIOS kept in the low memory area, from address 0x000E0000 onwards.

lfs [ ~/Compaq Evo N610c ]$ strings -tx fwh-000E0000-000FFFFF.bin | head -40
    960 G  b,#
    97c I  x8t
    983 IB x8t
    9de PCDB:
    9e5 MD %<
    a01 MM }?
    a0f MK ?A
    a55 BE $2I
    a5c BD     2I
    aac ^ Error
    ab5                                                                                                               G [<offset>]
    b30   Starts execution at the current CS:EIP. Breaks at optional <offset>.
    b77 G =<start> [<end>]
    b8a   Starts execution at <start> offset. Breaks at optional <offset>.
    bd1   Clears all breakpoints and starts execution at current CS:EIP.
    c13 T [<count>] [NOREGS]
    c28   Executes one instruction at current CS:EIP or executes <count> instructions.
    c77   Specifying NOREGS will turn off register dump for each instruction.
    cbd   SPecifying * for <count> will trace until a breakpoint is triggered.
    d04 TI [<count>] [NOREGS]
    d1a   Same as T but steps into INT calls.
    d40 P [<count>] [NOREGS]
    d55   Same as T but steps over CALL, LOOP and REP instructions.
    d92 U [<addr>] [<end>]
    da5   Disassemble code. If no selector, CS is assumed. Breaks at optional <end>
    df1   offset. Substituting a "$" for <addr> will start from current CS:EIP.
    e39 U [<addr>] [l<count>]
    e4f   Disassemble <count> lines (hex).
    e74 R [<reg> [<val>]]
    e86   Display/modify CPU registers. If <val> specified, it is written to <reg>.
    ed2 DR [<reg> [<val>]]
    ee5   Display/modify CPU debug registers.
    f0b SR [<reg> [<val>]]
    f1e   Display/modify CPU registers saved in SMI RAM.
    f52   Display CPU control registers.
    f74 I<size> <port>
    f83 O<size> <port> <val>
    f98   Input/output data, where <size> is B (8-bits), W (16-bits), or D (32-bits).
    fe6   Output to port, where <size> is B (8-bits), W (16-bits), or D (32-bits).
   1032 PI<size> <bus> <device> <func> <index>

It seems pretty obvious that there is a debugger / disassembler in the BIOS.
The question is: how can this be activated? How can I run it?

This was my hardest hack as it was the one that took the longest. But it was very interesting nonetheless.
It led me to this document. It contains a very nice reference of the PC architecture BIOS calls/interrupts.

Another cool document is this one, which is a PC-DOS / MS-DOS 3.30 technical reference, with a good introduction and
a description of all the interrupts and function calls. There is also a good description of the DEBUG command.

So, I was going around in radare2 trying to find any reference from the "normal" BIOS at addresses 0xF0000-0xFFFFF
to addresses in the 0xE0000-0xEFFFF range but couldn't find any... Then I looked at the disassembled code and tried
to find any piece of code that was relatively large and did not end with POPs and a RET. Also unsuccessful. Then I
thought about randomly jumping into code in that region and seeing what happened. This is why I started investing time in
understanding the DOS DEBUG command. Then by mere chance I was scrolling up and down and saw this...

So... Lots of ffff followed by lots of 0000 with some stuff in between.
A number of EAs in there, between the other numbers...
I had encountered EAs in disassembly before, so I knew these were long jumps.
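For anyone who has not met them: in 16-bit real mode, opcode 0xEA is a far jump, followed by a 16-bit offset and a 16-bit segment, both little-endian. Below is a small sketch of decoding one; the struct and function names are hypothetical, and the bytes in the usage note are invented for illustration rather than copied from the dump.

```c
#include <stdint.h>

struct far_ptr {
    uint16_t segment;
    uint16_t offset;
};

/* Decode "EA oo oo ss ss". Returns 1 on success, 0 if the first byte
   is not the far-jump opcode. */
int decode_far_jump(const uint8_t p[5], struct far_ptr *out)
{
    if (p[0] != 0xEA)
        return 0;
    out->offset  = (uint16_t)(p[1] | (p[2] << 8));
    out->segment = (uint16_t)(p[3] | (p[4] << 8));
    return 1;
}

/* Real-mode physical address: segment * 16 + offset. */
uint32_t linear_address(struct far_ptr fp)
{
    return ((uint32_t)fp.segment << 4) + fp.offset;
}
```

As an example, the bytes EA 05 80 00 E0 decode to a jump to E000:8005, which is physical address 0xE8005.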

It was time to boot into DOS and try them out.

This will start the DEBUG command, assemble a jump instruction in that memory location and then run it
(command g stands for go.)

Some of these locations would crash the computer, while others would just restart it and I felt like I was back to
square one.
I noticed, however, that when jumping to E000:8005, as shown above and just before the reset, the cursor would
change shape from an underline to a block. The blocky cursor would then also be present at the top of the BIOS boot
splash screen which did not happen before. After rebooting the computer N times and losing hope I simply pressed
the ENTER key at the splash screen with the blocky cursor and.... surprise surprise...

... I was presented with a prompt to the monitor/disassembler/debugger program in the BIOS.
This is more advanced and has more commands (including help screens) than the DOS DEBUG command but works in
a similar way. It allows you to disassemble, dump, look into register values, do ins, outs, even look into and change
values in the PCI configuration space. You can also define breakpoints.

If you are an operating system developer and/or if you want to hack boot sectors and/or earlier initialisation boot
code this is very useful.

   Update 7-Oct-2019

So, I made a boot sector like this...

... and added an entry in grub.cfg

menuentry "Compaq BIOS debugger" {
    set root=(hd0,1)
    chainloader /boot/debug-c.bin
}
Now, every time I restart the laptop I have the option of starting the debugger.

To spin or not to spin, that is the question

As I mentioned in the other page about the computer, "Pentium4s ... get hot and spin the fans too often."
This is definitely true and I tried to understand the logic behind this and investigate why this is so.
So, I looked in the linux kernel sources and, buried there, there is a small program called tmon, in the tools/thermal/tmon/ folder. Compilation went smoothly and this is what I saw when I ran it.

I'll let the README explain what the program is and what it does.

Increasingly, Linux is running on thermally constrained devices. The simple
thermal relationship between processor and fan has become past for modern
computers.

As hardware vendors cope with the thermal constraints on their products, more
and more sensors are added, new cooling capabilities are introduced. The
complexity of the thermal relationship can grow exponentially among cooling
devices, zones, sensors, and trip points. They can also change dynamically.

To expose such relationship to the userspace, Linux generic thermal layer
introduced sysfs entry at /sys/class/thermal with a matrix of symbolic
links, trip point bindings, and device instances. To traverse such
matrix by hand is not a trivial task. Testing is also difficult in that
thermal conditions are often exception cases that hard to reach in
normal operations.

TMON is conceived as a tool to help visualize, tune, and test the
complex thermal subsystem.

What it does not explain is that the structures and values that the kernel keeps in /sys/class/thermal end up
there as a result of the processing of the ACPI tables.

So, what we see is that there are 2 thermal zones (i.e. temperature sensors) and their relationship with the
different "fans". We see also the current temperature reported by each sensor (bars near the bottom) and the
temperatures that trigger the fans, i.e. the trip point bindings.
There aren't really 3 fans, what we have is a fan that can rotate at 3 different speeds.

The letters A, P and C stand for Active cooling (fan), Passive cooling (processor clock slow down) and Critical.
If you reach a critical temperature the system should shut down to protect itself.

tmon has another function: it allows you to manually switch the fans on and off, and so manually select the speed as well.
What it does not let you do is change the thresholds.

So, we can see that, if the temperature is more than 50°C, then the fan will start at the lower speed, when it reaches
60°C it increases the speed and then at 84°C we get the higher speed.
On the way down, it will lower the speeds at 70°C, 55°C and will switch it off at 40°C.

I did a number of experiments, switching the fans on and off and realised that, completely idle, the processor
will naturally be at around 45°C or thereabouts. Left alone, the system will spin the fan at the lower speed
almost constantly, as it will hardly ever get down to 39°C.
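The behaviour can be modelled as a small state machine with hysteresis. This is only my reading of the numbers above, not the ACPI code itself; the function name is hypothetical and I am assuming the "on" thresholds apply when the temperature rises to them and the "off" thresholds when it drops below them.

```c
/* Fan speed levels: 0 = off, 1 = low, 2 = medium, 3 = high.
   Given the current level and the temperature in degrees Celsius,
   return the level the fan should move to. */
int next_fan_speed(int speed, int temp_c)
{
    /* Rising thresholds: only ever step the speed up. */
    if (speed < 1 && temp_c > 50)  speed = 1;
    if (speed < 2 && temp_c >= 60) speed = 2;
    if (speed < 3 && temp_c >= 84) speed = 3;

    /* Falling thresholds: only ever step the speed down. */
    if (speed == 3 && temp_c < 70) speed = 2;
    if (speed == 2 && temp_c < 55) speed = 1;
    if (speed == 1 && temp_c < 40) speed = 0;

    return speed;
}
```

With an idle temperature of about 45°C, a fan that is off stays off (next_fan_speed(0, 45) is 0), but a fan that has been started once stays on (next_fan_speed(1, 45) is 1) because the processor never cools below 40°C. That is the constant low-speed spinning described above.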

Some of the earlier manuals of the laptop don't even mention the 2GHz Pentium4 and it occurred to me that maybe
Compaq did not change anything in the ACPI tables to account for the different processors that ended up being
shipped with it.

So, as you can see in the backlight bug, I had already extracted and decompiled the ACPI tables and could look into the
source code. I could still not fully understand it due to lack of time and willpower to learn the language.
But that did not stop me looking on the internet for ways to change the ACPI tables.

One of the first links that I clicked led me to this forum thread. Lots of interesting and sad stories about people with really
buggy DSDT ACPI tables (DSDT is the table where the code is kept.)

Two things caught my eye here. The first was the mention of a _CRT value and the other that the temperature is in
Kelvin * 10.

I looked for _CRT in my DSDT

Sure enough, here it was.

0x0E58 is 3672 in decimal.
To convert from Kelvin to Celsius I subtracted 2732 and divided the result by 10, which gave 94°C.
This was encouraging because this was the temperature I could see in tmon.

Would it work the other way around? What value could I expect for 40°C, 55°C, etc...
As an example, 40 * 10 + 2732 = 3132 = 0x0C3C
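The two conversions can be written as a pair of helpers. This is just the arithmetic above in code form, with hypothetical names, using 2732 as the offset for 0°C as in the calculation above.

```c
#include <stdint.h>

/* ACPI thermal values are in tenths of a kelvin. */
#define ACPI_ZERO_CELSIUS 2732

/* ACPI value -> tenths of a degree Celsius (940 means 94.0). */
int acpi_to_decicelsius(uint16_t raw)
{
    return (int)raw - ACPI_ZERO_CELSIUS;
}

/* Whole degrees Celsius -> ACPI value to look for in the DSDT. */
uint16_t celsius_to_acpi(int celsius)
{
    return (uint16_t)(celsius * 10 + ACPI_ZERO_CELSIUS);
}
```

So acpi_to_decicelsius(0x0E58) gives 940, i.e. 94.0°C, and celsius_to_acpi(40) gives 0x0C3C, the value to search for.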

Cool, it seems like all the temperatures I was looking for were there, from line 5564 to line 5579

Time to change them.

All that was left was to recompile the table with iasl and test it.

The kernel file Documentation/acpi/dsdt-override.txt tells you how to do it.
(It directs you to this helpful article from Intel.)

$ generate/unix/bin/iasl -tc -cr -vr dsdt.dsl

I then needed to edit the resulting dsdt.hex file and change the name of the table to AmlCode.

$ cp dsdt.hex /sources/linux-4.9.135/include/DSDT.hex

I ensured that...


... compiled a new kernel and it worked!!!


I had changed the thresholds successfully and the fan now spins only sporadically.

(Comments on this page are welcome - please email me at compaq(a.t.)edbatalha.info )