When the syscall number has to be loaded from a pc-relative location,
abuse END macros to place the number at the end of the syscall wrapper
rather than in the middle of it, so that there is no need to branch
around it; this saves two instructions in every wrapper whose syscall
number is >= 128.
While there, also tweak the error return (SET_ERRNO_AND_RETURN) to
return a 64-bit value only for lseek; this saves another instruction for
every other syscall.
With input from guenther@; "Anything that makes the machine faster" deraadt@