


 zeroing out caller saved registers 

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
I’ve been monitoring the LLVM newsgroup and came across the topic of zeroing out caller-saved registers during a function call. I think this is primarily a security precaution, but I’d like to know more about it. I was wondering if it would be useful to have a store instruction that does this automatically. Saving caller-saved registers is a reasonably frequent operation, and if they need to be zeroed out, making that part of the store instruction would make things more efficient.

_________________
Robert Finch http://www.finitron.ca


Thu Aug 13, 2020 7:51 pm

Joined: Tue Dec 11, 2012 8:03 am
Posts: 285
Location: California
I haven't seen the LLVM topic (and I'm not on the newsgroup anyway), but I often wonder what the justification is behind zeroing variables or registers before starting something. They say things like "Uninitialized variables are a big source of bugs." Well, true, but initializing them to zero is usually not any better than not initializing them at all. The correct initial value probably won't be zero; and zero can cause additional problems, for example if you try to divide by it. I've had iterative functions where the number you start with almost didn't matter because you'll be converging on an answer, and starting with a good estimate just reduces the number of iterations needed; but starting with zero, you never would finish. It might just crash, or get stuck in an infinite loop, or it might cause some sort of interrupt. I never do automatic initializations. Never. All my initializations are deliberate and appropriate for the situation.
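
To make the convergence point concrete, here is a minimal sketch using Newton's iteration for a reciprocal. The choice of iteration, the function name `reciprocal`, and the tolerances are illustrative assumptions, not the code referred to above. Any starting estimate strictly between 0 and 2/a converges, and a closer estimate needs fewer iterations, but a start of exactly zero stays at zero forever, so only the iteration cap lets the loop terminate.

```python
def reciprocal(a, x0, tol=1e-12, max_iter=50):
    """Approximate 1/a (a > 0) with Newton's iteration x <- x * (2 - a*x).

    Returns (estimate, iterations_used), or (None, max_iter) if the
    residual |a*x - 1| never drops below tol.
    """
    x = x0
    for i in range(1, max_iter + 1):
        x = x * (2.0 - a * x)
        if abs(a * x - 1.0) <= tol:
            return x, i
    return None, max_iter


rough, n_rough = reciprocal(4.0, 0.1)   # poor but usable first estimate
good, n_good = reciprocal(4.0, 0.2)     # better estimate, fewer iterations
stuck, n_stuck = reciprocal(4.0, 0.0)   # blanket zero default: never converges
print(rough, n_rough, good, n_good, stuck, n_stuck)
```

Without the `max_iter` cap, the zero-start call would loop indefinitely, which is the failure mode described above.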

_________________
http://WilsonMinesCo.com/ lots of 6502 resources


Thu Aug 13, 2020 8:58 pm

Joined: Sat Feb 02, 2013 9:40 am
Posts: 2095
Location: Canada
I think the concern was one of security: people being able to subvert running software through values passed into a routine from the caller. The caller-saved registers would normally be initialized by the compiler in the called routine. But if someone were to create a routine that used the values passed in before the registers are initialized, it might be a risk. You made a good point, Garth, that it can be detrimental to just initialize to zero without thinking of program operation. So, how about initializing to random values? What value could be used? 0xDEAD... Note that by initializing the vars on entry into the routine, they end up being initialized twice: once to prevent security loopholes and a second time for proper program operation. If the first initialization could be done as part of the caller save, it would save on the instruction count.
Some compilers will spit out an error if a variable is used before it's initialized.
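
As a rough sketch of the instruction-count argument, the toy model below compares a separate store plus scrub against a fused store-and-clear. Everything in it is hypothetical: the `ToyCPU` class, the `store_and_clear` operation, and the `0xDEADBEEF` fill pattern are assumptions made for illustration and do not describe any real instruction set.

```python
POISON = 0xDEADBEEF  # hypothetical fill pattern; zero would work the same way


class ToyCPU:
    """Tiny register-machine model that counts executed instructions."""

    def __init__(self, values):
        self.regs = list(values)
        self.stack = []
        self.executed = 0

    def store(self, r):
        # Ordinary store: push register r and leave it unchanged.
        self.stack.append(self.regs[r])
        self.executed += 1

    def load_imm(self, r, value):
        # Ordinary load-immediate, used here to scrub a register.
        self.regs[r] = value
        self.executed += 1

    def store_and_clear(self, r, fill=POISON):
        # Hypothetical fused instruction: push register r, then overwrite it.
        self.stack.append(self.regs[r])
        self.regs[r] = fill
        self.executed += 1

    def load(self, r):
        # Restore register r from the top of the stack.
        self.regs[r] = self.stack.pop()
        self.executed += 1


def call(cpu, caller_saved, callee, fused):
    """Save and scrub caller-saved registers, run callee, then restore."""
    for r in caller_saved:
        if fused:
            cpu.store_and_clear(r)
        else:
            cpu.store(r)
            cpu.load_imm(r, POISON)
    result = callee(cpu)
    for r in reversed(caller_saved):
        cpu.load(r)
    return result


def snoop(cpu):
    # A callee that reads a register before initializing it.
    return cpu.regs[1]


plain, fused = ToyCPU([10, 20, 30, 40]), ToyCPU([10, 20, 30, 40])
leak_plain = call(plain, [1, 2], snoop, fused=False)
leak_fused = call(fused, [1, 2], snoop, fused=True)
print(hex(leak_fused), plain.executed, fused.executed)
```

In both variants the callee sees only the fill pattern and the caller gets its values back; the fused form simply spends one instruction per saved register instead of two.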

_________________
Robert Finch http://www.finitron.ca


Fri Aug 14, 2020 2:34 am

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1780
Does sound mostly like a security thing to me. Zero isn't a bad value, as in many cases trying to follow a zero pointer will crash: the idea here is not to make bad code work, but to make it fail.


Fri Aug 14, 2020 2:02 pm

Joined: Wed Oct 23, 2019 1:52 am
Posts: 9
Location: Sacramento, CA
Yeah, it's security/stability paranoia. I can kind of see what advocates for it are getting at, but on the other hand, I'd personally file it under "should never be relevant if you're doing things right." The only place a system call should go is into the OS, and the only place a program should come back from is the OS - so, provided that the OS A. doesn't use uninitialized registers in its own code (which it should never be doing in any case) and B. restores each user thread to the exact state it was left in (which it should always be doing in any case) I can't see any way for it to be an issue. If an unused register is causing code to bomb, well, either it's not really an unused register, or you've got much bigger issues.

Edit: now, on further consideration, I can see a potential security issue if you're using a caller-saves convention: the possibility of accidentally passing back sensitive information (locations of system tables, etc.) in registers that the caller could then read. But that has nothing to do with the initial values of registers; it's the cleanup after the OS function and before returning to the caller that would be the important part.

That said, if a store-and-clear instruction doesn't cost you too much, it's not necessarily a bad thing to have.

_________________
"'Legacy code' often differs from its suggested alternative by actually working and scaling." - Bjarne Stroustrup


Tue Aug 18, 2020 3:00 pm

Joined: Mon Oct 07, 2019 2:41 am
Posts: 585
I think this is a minor point. Look at Windows: it has so many 'secret' redirections to system function calls and DLLs that knowing what, say, reg 23 holds at 0x0000:4310 is minor. "We will just format drive C if you don't send 5000 yuán to the East West Bank of China today" seems like a more dangerous problem, as do the new buggy features added with every upgrade.
Ben.


Tue Aug 18, 2020 8:29 pm

Joined: Fri Mar 22, 2019 8:03 am
Posts: 328
Location: Girona-Catalonia
I used to be on the LLVM group, but I left it some time ago because it just filled my mailbox with junk or stuff that was uninteresting (for me). So I have not had the opportunity to read this particular thread, but just the title suggests to me that this is unfounded security paranoia. I think this should be a relatively minor point. Some high-level languages default to zero-initialising variables, and some (older ones) do not. The programmer should be aware of it and make proper use of it. For performance reasons, the C language does not automatically initialise local variables; the programmer must do it explicitly. At best, the compiler should warn about variables being used before being assigned a value, but that's all as far as I'm concerned. C is C and it should remain as is. More recent languages specify that all variables are automatically zeroed upon declaration. That is what programmers perceive, but in reality compilers are clever enough to do so ONLY for variables that are used before a value is assigned to them, so I'm OK with that as well. Now, going so far as to make the processor do it in all cases, or having explicit instructions for it, is unnecessary overhead with little or no advantage, in my opinion.


Sun Aug 23, 2020 9:27 am

Joined: Thu Dec 05, 2019 7:53 am
Posts: 13
Location: Tokyo, Japan
Adversaries love attitudes such as, "This is unfounded security paranoia" and "It's a minor point"; this is exactly how unanticipated holes are left in systems for adversaries to take advantage of. Remember, an attacker needs to find only one insecure point of entry in your entire system in order to take advantage of you. Focusing on just the "bigger" problems leaves them plenty of avenues to try to gain access.

This may be only tangentially related, but I do deal explicitly with uninitialized registers in the tests for my MC6800 simulator, and this is a case where hardware support for this kind of thing wouldn't help at all. The issue is that when I have an uninitialized register, I need to load it with either:
  • an unlikely value, if it's an untested register, in order to ascertain that the register is not changed by the code under test (CUT), or
  • a value that's not the tested result value, if the register is tested for a particular value after returning from the CUT.

So, for example, if a simulated instruction is supposed to set the carry flag, and I test that, the test harness sees this and knows to clear the carry flag before executing the simulated instruction. Otherwise the carry flag is set to a specific value and the harness asserts that it still has that value when the code under test has completed running. (Ideally a test that doesn't check the carry flag should be run twice, once with the carry flag cleared before running the CUT and again with the carry flag set. But stuff that touches flags usually has all flags tested, so I don't go quite this far. Untouched registers are easier because I can pick an arbitrary value, such as $A5 for the A accumulator, and if the CUT touches it unexpectedly it's a lot less likely to be writing that exact value to it.)
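
The two rules above can be sketched as a small harness. This is only an illustration under stated assumptions: the names `prepare`, `run_test`, `sec`, and `sec_buggy`, the dictionary-based machine state, and the reduced register and flag set are all hypothetical and are not taken from the actual simulator's test code.

```python
SENTINEL = 0xA5  # arbitrary "unlikely" byte, like the $A5 mentioned above

REGISTERS = ("a", "b")   # hypothetical subset of the machine state
FLAGS = ("C", "Z")


def prepare(expected):
    """Build the pre-run state from the values the test will assert."""
    state = {}
    for r in REGISTERS:
        if r in expected:
            # Tested register: start from a value that differs from the result.
            state[r] = expected[r] ^ 0xFF
        else:
            # Untested register: sentinel that must survive the run.
            state[r] = SENTINEL
    for f in FLAGS:
        # A tested flag starts as the opposite of the expected result;
        # an untested flag gets a fixed value that must survive the run.
        state[f] = (not expected[f]) if f in expected else False
    return state


def run_test(cut, expected):
    """Run cut(state) -> new state and check tested and untouched items."""
    before = prepare(expected)
    after = cut(dict(before))
    for name, start in before.items():
        want = expected.get(name, start)
        assert after[name] == want, f"{name}: got {after[name]!r}, want {want!r}"


def sec(state):
    # Stand-in for a "set carry" instruction.
    state["C"] = True
    return state


def sec_buggy(state):
    # Same, but it also clobbers accumulator A.
    state["C"] = True
    state["a"] = 0
    return state


run_test(sec, {"C": True})          # passes silently
try:
    run_test(sec_buggy, {"C": True})
except AssertionError as err:
    print("caught:", err)
```

Deriving the starting state from the expected values keeps each test to a single declaration of what should change; everything not mentioned is implicitly asserted to be untouched.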

I'd like to extend this to my unit tests for the assembly code routines I run on the simulator, but that's a fair amount of work that I'm not really sure how to approach yet.

_________________
Curt J. Sampson - github.com/0cjs


Thu Dec 24, 2020 1:52 pm

Joined: Wed Jan 09, 2013 6:54 pm
Posts: 1780
In the context of chip simulation, if using a two-state simulator which doesn't handle the 'unknown' value, it's common to run a regression suite with everything initialised to ones, to zeros, and to random values. Of course it doesn't catch everything. In the context of fuzzing an application, one might inject likely-meaningful values like zero, one, minus one, as well as random values.
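
A minimal sketch of that kind of multi-pattern regression run, written in Python for illustration. The helper names `init_patterns` and `run_suite`, the 8-bit state width, and the two example tests are assumptions; the fixed random seed is a design choice so that a failing random run can be reproduced.

```python
import random


def init_patterns(width_bits=8, seed=0):
    """Return named fill functions: all zeros, all ones, seeded random."""
    mask = (1 << width_bits) - 1
    rng = random.Random(seed)  # fixed seed so a failing run can be repeated
    return {
        "zeros": lambda: 0,
        "ones": lambda: mask,
        "random": lambda: rng.randrange(mask + 1),
    }


def run_suite(tests, state_names, width_bits=8, seed=0):
    """Run every test once per pattern; return (pattern, test name) failures."""
    failures = []
    for label, fill in init_patterns(width_bits, seed).items():
        for test in tests:
            state = {name: fill() for name in state_names}
            if not test(state):
                failures.append((label, test.__name__))
    return failures


def counter_with_reset(state):
    state["count"] = 0                            # explicit reset
    state["count"] = (state["count"] + 1) & 0xFF
    return state["count"] == 1


def counter_without_reset(state):
    # Relies on whatever value the state happened to start with.
    state["count"] = (state["count"] + 1) & 0xFF
    return state["count"] == 1


failures = run_suite([counter_with_reset, counter_without_reset], ["count"])
print(failures)
```

The test that forgets its reset passes under the all-zeros pattern and fails under all-ones, which is the kind of hidden dependence on initial state these repeated runs are meant to expose.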

Once you've got automated tests, running them four times over (once in a while) may not be too much hardship. In the chip world, it was common to have a short set for running pre-checkin, a longer set to run overnight, and something more substantial to run over the weekend. Tests which run while you're asleep, or more generally which run while you're not waiting for them, can be rather handy.


Thu Dec 24, 2020 2:07 pm

Joined: Thu Dec 05, 2019 7:53 am
Posts: 13
Location: Tokyo, Japan
BigEd wrote:
Once you've got automated tests, running them four times over (once in a while) may not be too much hardship.

Running the tests for something as simple as an emulator for a 1970s CPU is certainly no hardship at all; even though the emulator's written entirely in Python, the 500 or so test cases run in under five seconds on a very slow (by 2010 standards) Pentium CPU. (Those are the tests just for the CPU core; the memory subsystem etc. is another couple of hundred.)

The problem is writing the tests. You might think that automated test case generators would be able to help with that, but Python's only static type is "object," which doesn't give much for a test case generator to go on.

_________________
Curt J. Sampson - github.com/0cjs


Thu Dec 24, 2020 2:17 pm