New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Recursion instead of goto #3
Comments
Why hello! Thank you for asking this most excellent question! I have indeed considered alternatives to the goto. I have evaluated them to a great extent, and I am happy to present the results to you here. When the PHP parser reads a source file, that source code is compiled down to a series of opcodes, that will then be executed by the Zend (tm) (r) Engine. The compiler does some basic optimizations, but is really quite stupid. And so, depending on what code you write, it will generate different opcodes. This has a direct performance impact. There are several way in which a loop can be written. Let's start with the one you have proposed, the recursive call.
When you give this code to the PHP compiler, it will generate these opcodes:
As you can see, it is generating 24 instructions. The most expensive portion of this is the function calls, since the arguments need to be sent individually, and there is an additional instruction (DO_FCALL_BY_NAME) for the actual function call. There is no reason why this would be necessary. As described by Steele in his paper Lambda: The Ultimate GOTO, tail calls can be compiled to very efficient instructions. The PHP compiler however, does not take advantage of this technique, so calls are quite costly. Let's try and improve this. By using a while loop.
Here is what the compiler gives us:
This already looks a bit better. But there is a rather inefficient FETCH_CONSTANT instruction right at the top. This requires doing a namespace lookup against This gets rid of the FETCH_CONSTANT call, and puts the boolean
But JUMPZ with argument true is a redundant expression. true is never zero. So ideally we would simply eliminate this check. PS: So can we eliminate the redundant jump? Let's try a
Awesome! The extra initial JMPZ has been removed! But, it comes at the cost of a JMPNZ at the end. So it will be more efficient for the success case, but for the retry, we will still do redundant checks for a jump that should be unconditional. And there is a way to eliminate that final conditional, by using the excellent goto feature built into PHP. It produces the identical opcodes as do...while, but makes that final jump non-conditional! And there you have it. This is the most efficient solution to do non-conditional loops in PHP. All other options are simply too slow. |
Igor will you marry me <3 |
SCOTT!!! :( |
BUT LOOK AT THIS MAN |
@michellesanver Don't worry, I sit next to him and I was first! |
Thanks for this explanation, your PHP knowledge is amazing o_O |
Hello @iGore |
So, you have to rename your Twitter account igorgoto, it will be more efficient... :) |
This is the best justification for using "goto" in real life ever. Nicely done Igor. |
Why not manually unroll the loop a few times. Most invocations will be for small n, so you might eliminate jumps alltogether. |
+1 for marrying me too :) |
@igorw You should write a book "Answering GitHub Issues The Right Way". |
Thanks for the explanation, but I'm unclear as to where the do...while approach falls down. Where would the overhead be on something like:
|
@acleon I was a bit confused when I was reading this too, if you look at the actually code in the repo: |
Have you actually benchmarked this? You do realize that the cost of network operations is several orders of magnitude larger than the cost of a few extra opcodes? This is a textbook illustration of premature optimization that gains almost no optimization and sacrifices lots of maintainability in the process. If your purpose is purely academic, then go for it, I suppose. Otherwise... |
...otherwise you can still go for it since implementation is fairly simple and consists of only several lines of code, so it is not a big deal. |
So readability and maintainability are "not a big deal"? |
@jcoleman I don't see any readability nor maintainability issues in this project. Do you? |
@davestevens @igorw are conditional jump operands ( |
For reference: https://files.ifi.uzh.ch/rerg/arvo/courses/kvse/uebungen/Dijkstra_Goto.pdf This is a great demonstration of why so many computer science people look down on the PHP community. Your justification is basically "why not", when my point is that "it doesn't actually gain you anything. |
@igorw since you're an internal PHP developer, can't you do something about this? |
Any improvement which is not done to the worst performing step is not an improvement at all. The code is as fast as the slowest step. So irrespective of this detailed analysis, you need to first show that this loop is the slowest thing in the program. The whole point is you should have strong reasons to go against the standard practices of good programming. |
@jcoleman Peronaly, I never use gotos, because such tiny performance improvements does not change anything in apps I write.
I don't see any network operations in this project, so your argument is not apply-able here. |
The readme specifically states that the primary use-case for this project is for retrying network operations. Where else would you use this construct? |
@jcoleman Hehe, right. I agree, in case of retrying network operations, this tiny benefit is irrelevant. |
@jcoleman @umpirsky Yeah, but your point remains: this code is not unmaintainable. You can complain that it doesn't fit a style standard, but that's about as far as @jcoleman is going to get here. Particularly since the GOTO is restricted just like it is in C (where it is used heavily and where it is even considered good style for things like error handling), which means most of the complaints that Dijkstra is making aren't even applicable here. |
This is not C. Is there another language where GOTO is considered good style? Not in common languages. |
Semi-related: I thought you could not namespace the following constants
Though this php eval says you can for any version of PHP that supports namespaces (though not on HHVM). The best part is that if you don't use |
Very good reply! 👍 |
@bloodyowl says the guy whose entire life depends on not blocking a single thread... |
I don't understand the "this is a micro library so one little goto doesn't count the same as it otherwise might" argument. "It's 19 lines, get over it @jcoleman" Any well designed codebase will have plenty of things encapsulated into little one job routines like this. You should be working with 20 lines of code, give or take, at a time, all the time. The same "well surely a goto is fine in this little ol' function...for speed" argument would apply constantly, and that's how in the big picture you wind up with an unmaintainable turd sandwich. WE are the expense. WE should strive to do everything we can to ensure everything we do can be maintained as easily and painlessly as possible. If that isn't the fastest possible code, then spend a comparative pittance on upgrading the sever. Recursion should probably be avoided where a loop is just as painless and readable, but being worried about the "cost" of a loop is, IMO, silly. There are almost certainly always better places for you to invest your time than avoiding the cost of a bleeping while loop in anything you'd be using PHP* to accomplish. *Edit: This was not a "bash PHP" point. But you're not writing a first person shooter or other such thing where wringing every clock cycle possible out of a predefined system is important, in PHP. If all the "extra" memory of all the while loops you use is effecting your server, then spend $50 on some RAM. Don't spend your time on stuff like this. |
Has anyone here made a pull request to change this to a faster method? |
now i know how i can get womens |
@vita10gy yes... this isn't about high level coding style guides... it's just a low level view into how PHP CAN be optimized. in all my years writing PHP, i think i have maybe 2 goto commands in the million+ lines of code i've written, and they were in a compiler for a dynamic language that lives side by side with PHP. using the goto was by far the best solution speed wise and maintainability wise. i'm passionate about this because the "never use goto" camp is strong, and i don't want to see it deprecated. there are very few use cases where it makes a lot of sense, but they do exist. BTW, thanks for submitting all those portals in phoenix park ;) we live close. |
@cloudward You ENL or RES? Edit: also, as far as this goes, I'm not sure you can in one breath say that you've used goto 2 times in all your years, and in the next worry that the 'no goto ever' brigade is trying to rid the world of a useful tool. Almost by definition something can't be almost never useful and super useful. I think you could make a good argument that even if it has a very rare proper usage, (which while I can't think of any, I'm sure it does), and lots of ways to completely misuse it, that it would be for the greater good to just not have it. I mean, I can think of a couple use cases why rigging the windows in my house with explosives would be a good idea. All that really matters is if those use cases are worth all the times I blow guests or my wife up. |
@cloudward I have to say that I'd totally agree with most of that. I'm not advocating eliminating GOTO, and I'm not advocating never using it (remember, I've used GOTO in production too). If the purpose of this code usage was purely and exercise in how to do low-level optimization in PHP, then I think that's cool too. I just don't think that it's necessary (or that the speed it potentially gains is relevant) here in this use case. |
@vita10gy goto is like nuclear weapons... if you're using one, you better have a good reason, and most of the people who have them in good faith plan on not using them, but they're still building more and more of them. i'm ENL |
Figures, ENL would be GOTO lovers. ;) |
@jcoleman i added in the do while to the test, and used as many gotos as i could. the case in point is "n = 0"... where in wall time goto and do while are often exactly equal, and sometimes either one of them are faster. that directly relates to your post about adding a useless MOV to significantly speed up a system... there might be timings that are ripe for better overall performance, if everyone is doing things as conservatively as possible, that might block everyone from doing anything productively. something about good will hunting wanting a blonde. but technically, in the n = 0 case (which is expected to happen most frequently), the goto does a single less thing at the lowest possible level. <?php
$n = 1000;
$a = 1;
start:
$start = microtime(true);
try {
$a = 1;
while(\true) {
if($a > $n) {
throw new Exception('done');
}
$a++;
}
} catch(Exception $e) {
}
$after_while = microtime(true);
try {
$a = 1;
start_goto:
if($a > $n) {
throw new Exception('done');
}
$a++;
goto start_goto;
} catch(Exception $e) {
}
$after_goto = microtime(true);
try {
$a = 1;
do {
if($a > $n) {
throw new Exception('done');
}
$a++;
} while(\true);
} catch(Exception $e) {
}
$end = microtime(true);
echo "n = $n\n\twhile " . ($after_while-$start) . "\n\tgoto " . ($after_goto-$after_while) . "\n\tdo while " . ($end-$after_goto) . "\n\n";
$n = $n / 10;
if($n >= 1) {
goto start;
} elseif($n > 0) {
$n = 0;
goto start;
}
?> |
n = 1000 n = 100 n = 10 n = 1 n = 0 |
Advice from the past : http://www.reddit.com/r/PHP/comments/1uuc34/quick_test_to_see_if_array_walk_is_better_than/cen5ccs :-P |
@tamasimrei the code i pasted was indented with tabs, but github seems to ignore them. i tried wrapping in |
@cloudward you start and end your code with triple-backtick to turn it into a codeblock as such:
If you're going to show the world the importance of your argument, you might as well do it with syntax highlighting enabled! |
@jgoux Holy shit that was my thread! |
In case you need some more reasons why the GOTO statement is a feasible option for performance critical code such as the code in this library, I can highly recommend this paper by Knuth: |
This is a joke right? |
"Performance critical code" such as saving an instruction or two when On Tue, Oct 28, 2014 at 8:40 PM, vita10gy notifications@github.com wrote:
|
here from this post |
After all this, I just wanted to tell you that you do not always have to follow your textbooks. It makes totaly sense and if you cant read/maintain that single goto, just learn PHP. |
That's an old thread, but I'd like to comment some of the @jcoleman replies. He cited the famous Edgar Dijkstra letter, that criticized "The unbridled use of the go to statement" and remarked that maybe it should be restricted "to alarm exits". Linux has lots of goto statements for that. Remember that Donald Knuth replied Dijkstra through the also famous paper "Structured Programming with go to Statements". Supposedly Dijkstra wrote: "Please don't fall into the trap of believing that I am terribly dogmatical about [the go to statement]. I have the uncomfortable feeling that others are making a religion out of it, as if the conceptual problems of programming could be solved by a single trick, by a simple form of coding discipline." Well, the aphorism "premature optimization is the root of all evil" was written (and maybe invented) by Donald Knuth. For instance, in "The Art of Computer Programming" (1974): "Another important aspect of program quality is the efficiency with which the computer's resources are actually being used. I am sorry to say that many people nowadays are condemning program efficiency, telling us that it is in bad taste. The reason for this is that we are now experiencing a reaction from the time when efficiency was the only reputable criterion of goodness, and programmers in the past have tended to be so preoccupied with efficiency that they have produced needlessly complicated code; the result of this unnecessary complexity has been that net efficiency has gone down, due to difficulties of debugging and maintenance. The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming." If the idea is merely to cite famous works, well, Dijkstra is not dogmatic about "goto" statements and Knuth clearly favors optimization. They're just against abuses. It's trivial to understand the code and it's very efficient, specially compared to slow recursive functions in PHP. They're two wins specially in a library. I don't understand what's the fuss, except dogmatically avoid a tool that is available and was, seemingly, correctly used. |
@pedroac Please read my actual comments (like #3 (comment) ) and note that I am not dogmatically opposed to the use of But the speed difference between recursive functions and goto in the intended use case (a web request) is! mathematically speaking, purely statistical noise. Oh, and if we’re that concerned about speed...probably PHP is the wrong language to use. |
Perhaps PHP isn't the wrong language to use to figure out the most efficient way to do something in PHP. :) |
May I ask if there is a reason why you prefer using goto instead of function recursion?
The text was updated successfully, but these errors were encountered: