----- re: adapting v0.23d changes for a full release package
It is done and I will release it shortly, but there are a few other posts I need to make here first. (I really am setting a dreadful example for the newbies when it comes to double-posting in release threads... :lol:)
----- re: 'double-patching' of DISPLAYx registers being contrary to Sony's intended use of them
Quote:
Yes, now that you mention it, I remember that.
Thinking in terms of Pareto analysis (the 80:20 rule), we could keep this double patching for the DISPLAYx registers. Or, better, in the near future we could implement a toggle for it in the OSD, leaving this choice to the user.

I think a toggle would be best for now (though I still haven't added it in the new release). It might be possible, though, to add automatic recognition of separate patching needs in the future.
Rectangle circuit 1 and circuit 2 are almost identical, so we can't rely on either one to always define the full screen area, as it might be the other one instead. But we CAN nearly always rely on the larger one of the two to define the full area (except directly after a switch to a lower resolution). So if both are used and one is much smaller than the other, then that is a strong indication that it is just an overlay picture intended to be mixed into the main full screen picture generated by the other circuit.
But there are additional complications involved too, which need to be considered before a real implementation is made.
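To make the 'larger rectangle' idea concrete, here is a minimal sketch in C of how such a classification could look. Nothing here is taken from the GSM source: the DispRect type, the role names and the 'less than half the area' threshold are all assumptions chosen only for illustration, and the sketch ignores the resolution-switch case mentioned above.

```c
/* Hypothetical description of one rectangle circuit's output area. */
typedef struct {
    unsigned w, h;   /* width and height in pixels */
    int enabled;     /* nonzero if the circuit is in use */
} DispRect;

enum { ROLE_UNUSED, ROLE_MAIN, ROLE_OVERLAY };

/* Assumption: "much smaller" means less than 1/2 of the larger area. */
#define OVERLAY_AREA_DIVISOR 2ul

/* Classify both circuits: the larger enabled rectangle is the main area,
 * a much smaller second rectangle is treated as an overlay for it. */
static void classify_rects(const DispRect *r1, const DispRect *r2,
                           int *role1, int *role2)
{
    unsigned long a1 = r1->enabled ? (unsigned long)r1->w * r1->h : 0;
    unsigned long a2 = r2->enabled ? (unsigned long)r2->w * r2->h : 0;

    *role1 = ROLE_UNUSED;
    *role2 = ROLE_UNUSED;

    if (a1 >= a2) {
        if (a1) *role1 = ROLE_MAIN;
        if (a2) *role2 = (a2 * OVERLAY_AREA_DIVISOR < a1) ? ROLE_OVERLAY
                                                          : ROLE_MAIN;
    } else {
        *role2 = ROLE_MAIN;
        if (a1) *role1 = (a1 * OVERLAY_AREA_DIVISOR < a2) ? ROLE_OVERLAY
                                                          : ROLE_MAIN;
    }
}
```

With a 640x448 rectangle on circuit 1 and a 160x112 rectangle on circuit 2, this marks circuit 1 as the main area and circuit 2 as an overlay. Two rectangles of similar size are both treated as main areas and would be scaled independently, as is done now.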
Quote:
It seems that we have no more options for the moment...

Well, there is the method I mentioned above, whereby the larger of the two can be assumed to be the main area and scaled as is presently done, while a smaller rectangle is then assumed to be an overlay for the larger one and scaled according to it. This would probably improve the appearance of some games. But as usual, any major change like that would probably cause other games to fail, so great care needs to be taken to avoid that.
Another method that still remains to be tested is the 'caching' of scale and positioning patch values I mentioned in an earlier post, so that new calculation of patch values only needs to be done when the real 'input' values have changed. This should eliminate the timing problems with games that poke the DISPLAYx values 'too often' (like in every VBlank period).
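The caching idea can be sketched roughly like this in C. It is only an illustration of the principle, not the actual handler code: PatchCache, patch_cached and compute_patch are hypothetical names, and compute_patch is a placeholder standing in for the real scale and position calculation.

```c
#include <stdint.h>

/* One cache slot per patched register (e.g. one for each DISPLAYx). */
typedef struct {
    int      valid;    /* nonzero once the slot holds a result */
    uint64_t input;    /* last raw value the game tried to write */
    uint64_t patched;  /* patch value calculated from that input */
} PatchCache;

/* Counts full recalculations, only to show how often they happen. */
static unsigned recalc_count;

/* Placeholder for the expensive calculation; the real one would
 * derive scaled and repositioned field values from 'raw'. */
static uint64_t compute_patch(uint64_t raw)
{
    recalc_count++;
    return raw + 1;   /* dummy transformation, assumption only */
}

/* Return the patch value for 'raw', recalculating only on change. */
static uint64_t patch_cached(PatchCache *c, uint64_t raw)
{
    if (!c->valid || c->input != raw) {
        c->patched = compute_patch(raw);
        c->input   = raw;
        c->valid   = 1;
    }
    return c->patched;
}
```

A game that pokes the same DISPLAYx value in every VBlank would then trigger the full calculation once, and every later write would cost just one comparison.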
Quote:
Yes, and those further complexities are far beyond my knowledge. For now I am satisfied with all this GS / GSM crazy stuff... ;-)

I'm not really 'satisfied' myself, but some of this information is very hard to get, as the official development manuals do not contain any information on the most crucial GS registers (details on pixel clock generation etc). Apparently that info has been classified "for Sony eyes only"...
Now for some additional comments on the v0.23d changes you made:
Quote:
Summing up:
- Some "sync" instructions were commented out right after "mtc" and "lq" ones.

Yes, I knew that some of these were redundant. RAM accesses in particular are supposed to be single-cycle operations, as the RAM is integrated with the EE in the same chip. But I'm always worried about I/O device and coprocessor registers. Some of those are definitely NOT single-cycle, and it may not be a good idea to modify the source register that is being stored to such a special register before that storage operation has completed (in parallel to ongoing processing). Hence my tendency to use 'sync' even where it might not always be needed (like when other code delays ensure completion).
Quote:
- Some "dsrl" instructions were replaced by "srl".

I've accepted this change for the new version, but I don't understand the benefit.
The only real difference between them (since we don't really care about the upper bits) is that "srl" will fail to work at all if the upper bits of the source data are not 'validly' signed, that is, unless bits 32 through 63 are all identical to bit 31. What is the advantage of that?
"dsrl" on the other hand has no requirement at all on the input data and will therefore always perform the shifts correctly. And on a MIPS processor a 64-bit shift is no more 'expensive' than a 32-bit shift, since both are single-cycle operations. Thus we may as well use the operation that is guaranteed to always do the job, and never go on strike, like "srl" will do if it doesn't 'like' the upper bits of the input data.
For our case either instruction should work, as the input data should be properly signed, but I tend to prefer "dsrl" simply because of the reasons stated above.
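As a rough illustration of the difference, the two shifts can be modelled in C as below. This is a simplified model and not taken from any manual: it assumes "srl" shifts only the low 32 bits and sign-extends the 32-bit result, which only describes the case where the input is validly signed. The function names are made up for this sketch.

```c
#include <stdint.h>

/* Model of "dsrl": logical right shift of the full 64-bit register,
 * shift amount 0..31. */
static uint64_t model_dsrl(uint64_t rt, unsigned sa)
{
    return rt >> (sa & 31u);
}

/* Assumed model of "srl": logical right shift of the low 32 bits,
 * with the 32-bit result sign-extended to 64 bits. */
static uint64_t model_srl(uint64_t rt, unsigned sa)
{
    uint32_t lo = (uint32_t)rt >> (sa & 31u);
    return (lo & 0x80000000u) ? (0xFFFFFFFF00000000ull | lo)
                              : (uint64_t)lo;
}
```

Under this model both give the same low 32 bits when the upper half of the input is zero, e.g. 0x12345678 shifted by 4 is 0x01234567 either way. For a validly signed negative value like 0xFFFFFFFF80000000 shifted by 4, the "dsrl" model moves ones from the upper half into the low word (low 32 bits 0xF8000000), while the "srl" model gives 0x08000000. So the two are only interchangeable here as long as bit 31 of the input is zero.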
Quote:
- The "Set user mode on" code in the "DisplayHandler_Final_Exit" section was commented out, since when the processor takes a level exception it switches to kernel mode, and control is transferred to the applicable service routine (in our case, the DisplayHandler routine). So it seems a good idea for DisplayHandler to leave the processor mode unchanged.

I'm still not sure of this, but that is mainly because of the rotten documentation of the "ERET" instruction. Exactly what it does to the current processor state is not explicitly stated anywhere in the "EE Core Instruction Set Manual" (at least I have not been able to find it). So I still do not know how, or even if, it will restore the original mode that applied before the exception. But the new release has your change in this.
Quote:
- The $k0 and $k1 MIPS regs are now being stored/restored to/from the -0x10 and -0x20 relative addresses.

Ok, but did you have any special reason for changing the offset by 32 bytes?
Quote:
- The "select case" at the beginning of the DisplayHandler routine now has the same order of precedence as the four GS registers patched (SMODE2, DISPLAY2, DISPLAY1, SYNCV) on the "94" ASM label.

No. That is completely wrong. Label 94 has nothing to do with the register selection, as it is a label near the end of the DISPLAYx-specific code.
The only change I can see in the identification of those four registers is that you moved the tests for each register around, so they are now tested in a different sequential order with SMODE2 first and SYNCV last. But giving them all the same precedence is impossible with sequential testing. To do that we would need to use another jump table (then all entries really do have the same precedence), but doing so for just four entries would not be efficient. The overhead cost of jump table handling would waste more time than the full chain of sequential comparisons.
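For comparison, here is a small C sketch of the two dispatch styles. It is not the handler code itself: the function and enum names are hypothetical, and the register addresses are the commonly listed GS privileged register addresses, included here as assumptions.

```c
#include <stdint.h>

/* Assumed GS privileged register addresses. */
#define GS_BASE     0x12000000u
#define GS_SMODE2   0x12000020u
#define GS_SYNCV    0x12000060u
#define GS_DISPLAY1 0x12000080u
#define GS_DISPLAY2 0x120000A0u

enum { H_NONE, H_SMODE2, H_DISPLAY2, H_DISPLAY1, H_SYNCV };

/* Sequential testing: the first test is cheapest, the last costs four
 * compares, so the order of the tests IS the precedence. */
static int identify_sequential(uint32_t addr)
{
    if (addr == GS_SMODE2)   return H_SMODE2;
    if (addr == GS_DISPLAY2) return H_DISPLAY2;
    if (addr == GS_DISPLAY1) return H_DISPLAY1;
    if (addr == GS_SYNCV)    return H_SYNCV;
    return H_NONE;
}

/* Jump-table style: every entry costs the same, but each lookup pays
 * for a range check, an index calculation and a memory load. */
static const unsigned char gs_table[16] = {
    [2]  = H_SMODE2,    /* 0x20 */
    [6]  = H_SYNCV,     /* 0x60 */
    [8]  = H_DISPLAY1,  /* 0x80 */
    [10] = H_DISPLAY2   /* 0xA0 */
};

static int identify_table(uint32_t addr)
{
    uint32_t off = addr - GS_BASE;
    if (off >= 0x100u || (off & 0xFu)) return H_NONE;
    return gs_table[off >> 4];
}
```

Both functions return the same result for any address. The table version treats all four registers equally, but for so few entries its fixed overhead is likely to exceed the cost of even the longest chain of compares.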
Best regards: dlanor

