Have Joomla play nice with reverse caching proxies like Varnish, Nginx etc. #10373

Merged
merged 4 commits into from May 10, 2016

@fevangelou
Contributor

fevangelou commented May 10, 2016

This is an update for pull request #7677

Description

Reverse caching proxies like Varnish, Nginx etc. (also known as web accelerators) are built to cache the entire page output of a website and thus speed up delivery significantly. Such software works much more efficiently than Joomla's internal page caching, because PHP and the database (the usual performance bottlenecks) are invoked only when the content has to be constructed, not every time it is served.

Such software is not meant only for extremely high-traffic sites; even sites with low or moderate traffic can benefit from the speed improvement and thus rank better in search engine results.

Unfortunately, Joomla cannot work out-of-the-box with reverse caching proxies. This prevents Joomla from being easily set up in high-performance scenarios without additional custom coding or extensions, unlike other CMSs such as WordPress, which already provide the means for reverse caching proxies to work well with them.

There are currently 3 problems that prevent Joomla from working out-of-the-box with reverse caching proxies:

a) Sessions for logged-in users cannot be easily identified by that software. In other words, a reverse caching proxy is unable to "know" whether a user is logged into Joomla and therefore cannot decide whether to serve cached or un-cached output. This has the undesired effect that guest users could see sensitive data cached from a previously logged-in user.
b) The "remember me" cookie cannot be identified either (because its name is a hashed user agent string), so the related functionality is simply broken, even if we fixed (a).
c) Identifying which pages NOT to cache, even when neither of the above conditions applies. The typical example is the user login page, and the problem comes from the token used in the login form. The token is derived from a guest user's session cookie, so if the login page (or any page that requires user login, e.g. a user module, a members area etc.) has already been cached by the reverse caching proxy, there will be a mismatch between the token submitted with the form and the token Joomla would have generated for the actual user if no caching were in place. That's because a reverse caching proxy caches a page as generated for the first visitor who requests it. When a 2nd visitor comes in, the page displays what Joomla generated for the 1st visitor, and in the case of forms, the 2nd visitor gets the 1st visitor's session token printed in the HTML of that form. A token mismatch means the visitor cannot log in without re-submitting the form, which is an unnecessary user experience issue.
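
To make problem (c) concrete, here is a minimal Python sketch of a naive full-page cache replaying the first visitor's form to the second visitor. It is only an illustration: `render_login_form` and `NaivePageCache` are hypothetical stand-ins, and the SHA-256 based token is an assumption, not how Joomla actually derives its form token.

```python
import hashlib


def render_login_form(session_id):
    """Hypothetical stand-in for the CMS rendering a form with a per-session token."""
    # Assumption: the token is derived from the session id only.
    token = hashlib.sha256(session_id.encode()).hexdigest()[:16]
    return '<input type="hidden" name="%s" value="1">' % token


class NaivePageCache:
    """Caches the first response for a URL and replays it to every later visitor."""

    def __init__(self):
        self._store = {}

    def get(self, url, session_id):
        if url not in self._store:
            # Only the first visitor's request reaches the backend.
            self._store[url] = render_login_form(session_id)
        return self._store[url]


cache = NaivePageCache()
first = cache.get("/login", "session-of-visitor-1")
second = cache.get("/login", "session-of-visitor-2")
expected_for_second = render_login_form("session-of-visitor-2")

print(second == first)                # True: visitor 2 receives visitor 1's form
print(second == expected_for_second)  # False: the token does not match visitor 2's session
```

The second visitor submits a token that was generated for someone else's session, which is exactly the mismatch described in (c).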

The solution is overall simple and it requires the use of some identifiers in the form of cookies or HTTP headers:

For (a), all we have to do is set an identifier cookie only when users log in. This cookie only acts like a flag telling reverse caching proxies that as long as this cookie is present, the output of the page should never be cached. No sensitive data is exposed and this cookie is destroyed when the user logs out.

For (b), all we have to do is add a prefix to the existing cookie created by Joomla, whose name is currently just a hashed user agent string. This allows reverse caching proxies to check for this prefix and turn off any caching, just as they should when users log into a Joomla site.

For (c), the users component (com_users) and the entire Joomla /administrator path should emit an HTTP header which instructs the reverse caching proxy NOT to cache content at all, and therefore allows the Joomla form token to be created for the actual guest attempting to log in. These are the 2 most important user entry points in Joomla. Unfortunately, we cannot include the user login module in this change, because if that module is used on every page of the site, the HTTP header that prevents caching would be emitted everywhere and thus defeat the purpose of caching altogether. It's a design decision which must be addressed by the developer of each site. As an example, see how we handle user logins at http://www.joomlaworks.net which uses Varnish. You'll notice that every user login leads to a specific page only. This page is excluded from being cached by Varnish and therefore user logins work flawlessly.
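
The cookie check a proxy would perform for (a) and (b) can be sketched in a few lines of Python. This is only an approximation for experimenting with cookie names outside a proxy: the helper names are hypothetical, the "remember me" suffix in the usage example is made up, and the header for (c) is left out because its name is not defined here. Note that the sketch matches the prefix on cookie names only, which is slightly stricter than a plain substring match on the whole Cookie header.

```python
from typing import List


def cookie_names(cookie_header: str) -> List[str]:
    """Return the cookie names found in a raw Cookie request header."""
    names = []
    for part in cookie_header.split(";"):
        name, _, _ = part.strip().partition("=")
        if name:
            names.append(name)
    return names


def should_bypass_cache(cookie_header: str, prefix: str = "joomla_") -> bool:
    """True if any cookie name starts with the prefix, i.e. the proxy should not cache."""
    return any(name.startswith(prefix) for name in cookie_names(cookie_header))


# Guest with only a session cookie: cacheable.
print(should_bypass_cache("8c344a4d64c96915ff93f25331a22a1d=n8i2emphg2ohkqoik9suhslkn3"))
# Logged-in user carrying the "user state" flag: bypass.
print(should_bypass_cache("8c344a4d=abc; joomla_user_state=logged_in"))
# Returning user with a "remember me" cookie (suffix is a made-up example): bypass.
print(should_bypass_cache("joomla_remember_me_0123abcd=sometoken"))
```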

The proposed PR currently fixes (a) and (b), which are the most important aspects of Joomla's out-of-the-box interoperability with reverse caching proxies. Issue (c) is less important, as it can easily be addressed using an existing boilerplate configuration for Varnish, Nginx etc. When I have a user-configurable solution for this, e.g. a flag in the Joomla menu system to exclude pages from such caching by emitting a special HTTP header, I will make a new PR with the required code changes.

To sum things up:

  • This is a performance related improvement for Joomla.
  • It will allow even smaller sites (in terms of traffic) to benefit from the speed improvement and thus help rank better in search engines.
  • It has no impact on existing Joomla sites (aside from the fact that the current "remember me" cookie will simply be reset for users on the next update of Joomla, which doesn't really constitute a b/c issue).
  • It has no security impact and no sensitive data is exposed. The main prefix "joomla_" used in the cookies added/affected does not pose a security risk, as there are many other, even easier, ways to identify a Joomla site (e.g. by loading /robots.txt and checking the excluded paths, or /configuration.php and checking the Joomla-distinct message printed, and the list goes on...).
  • 3 plugin files are patched to provide a solution for (a) and (b) as described above.

Thank you.

P.S. This is the configuration for Varnish v3.x to detect the "user state" and "remember me" cookies in Joomla properly (it's really typical cookie detection in Varnish):

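sub vcl_recv {
    # [...other rules...]
    # Bypass the cache for any request carrying a "joomla_" cookie
    # (the "user state" flag or the "remember me" cookie).
    if (req.http.Cookie ~ "joomla_") {
        return (pass);
    }
    # [...other rules...]
}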

and

sub vcl_fetch {
    # [...other rules...]
    # Do not store responses fetched for requests carrying a "joomla_" cookie.
    if (bereq.http.Cookie ~ "joomla_") {
        return (hit_for_pass);
    }
    # [...other rules...]
}

A full example can be found here: https://gist.github.com/fevangelou/84d2ce05896cab5f730a

Summary of Changes

  • Added the prefix "joomla_remember_me_" wherever the related cookie name is set in /plugins/authentication/cookie/cookie.php
  • Added the prefix "joomla_remember_me_" wherever the related cookie name is set in /plugins/system/remember/remember.php
  • Modified /plugins/user/joomla/joomla.php to include the setting/unsetting of the "joomla_user_state" cookie to be used as a flag for when a user logs into Joomla and effectively instruct any reverse caching proxy to NOT cache the site's output.
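
As a rough illustration of the "joomla_user_state" flag described in the last item, the following Python sketch builds the kind of Set-Cookie headers a login and a logout could produce. It is not the plugin's code: the function names and the default path are assumptions, and expiring the cookie with Max-Age=0 is just one common way to unset a cookie, chosen here for the example.

```python
from http.cookies import SimpleCookie

COOKIE_NAME = "joomla_user_state"


def login_cookie_header(path="/", domain=""):
    """Build a session-cookie flag (no expiry) marking the user as logged in."""
    cookie = SimpleCookie()
    cookie[COOKIE_NAME] = "logged_in"
    cookie[COOKIE_NAME]["path"] = path
    if domain:
        cookie[COOKIE_NAME]["domain"] = domain
    return cookie.output(header="Set-Cookie:")


def logout_cookie_header(path="/", domain=""):
    """Expire the flag so the proxy may serve cached pages to this browser again."""
    cookie = SimpleCookie()
    cookie[COOKIE_NAME] = "logged_out"
    cookie[COOKIE_NAME]["path"] = path
    cookie[COOKIE_NAME]["max-age"] = 0  # assumption: unset by immediate expiry
    if domain:
        cookie[COOKIE_NAME]["domain"] = domain
    return cookie.output(header="Set-Cookie:")


print(login_cookie_header())
print(logout_cookie_header())
```

The flag carries no user data; its mere presence is what a proxy rule keys on.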

P.S. The prefix "joomla_*" (instead of "j_*", which would follow the usual Joomla naming conventions) was chosen for clarity. WordPress, for example, uses "wp_" or "wordpress_", and other CMSs use similar more-than-one-letter naming conventions. If we used just "j_*", that might have undesired effects, as the cookie name match may be more common.

Testing Instructions

To see the full benefits, use Joomla installed on a LAMP stack with Varnish 3.x and this configuration as a starting point: https://gist.github.com/fevangelou/84d2ce05896cab5f730a

Otherwise just get the update, open up your browser's dev tools and inspect the cookies set when you log in and out and when you tick the "remember me" checkbox in the login forms.

/cc @Hackwar @brianteeman

@andrepereiradasilva
Contributor

I didn't read the whole post, but I think I know the problem since I have it too.

But there is one thing I don't agree with: adding the text "joomla" to cookie names will make it very easy to detect that the server is using Joomla.

Can we have a param to customize that, like cookie_prefix (defaulting to joomla)?

@brianteeman
Contributor

brianteeman commented May 10, 2016 via email

@andrepereiradasilva
Contributor

There is also the language cookie in multilingual sites.
https://github.com/joomla/joomla-cms/blob/staging/plugins/system/languagefilter/languagefilter.php#L756

@andrepereiradasilva
Contributor

That's security through obscurity which is no security at all

Actually, the point is to try to avoid robots checking the sites.

@brianteeman
Contributor

brianteeman commented May 10, 2016 via email

@andrepereiradasilva
Contributor

I know that. But we can avoid some of the most common ones. That is just my opinion. Anyone is free to disagree.

@andrepereiradasilva
Contributor

Nevertheless, I just want to say that this PR is a good improvement.

@brianteeman
Contributor

brianteeman commented May 10, 2016 via email

@andrepereiradasilva
Contributor

@brianteeman as I said, it is my opinion. We can agree to disagree on this. 😄

@fevangelou can you do the same thing for the language cookie?
https://github.com/joomla/joomla-cms/blob/staging/plugins/system/languagefilter/languagefilter.php#L756

@fevangelou
Contributor Author

It's not required to have language-specific cookies, as Joomla uses different URLs either way and thus there is no issue with multilingual sites.

BTW in my description I clearly explain why there is no security issue with using the joomla_ prefix. Furthermore, it should not be user-configurable, as that would defeat its whole purpose in the first place.

@brianteeman is everything OK to be merged? I can then work on a solution for problem (c).

@andrepereiradasilva
Contributor

I read it now. You're only talking about the remember me cookie and a new cookie with the user login state.
I thought this PR was for the session cookie. Sorry, I misunderstood.

@andrepereiradasilva
Contributor

I have tested this item ✅ successfully on 856b264


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/10373.

@dgrammatiko
Contributor

I have tested this item ✅ successfully on 856b264



@dgrammatiko
Contributor

Thank you @fevangelou, can you also provide the solution for the admin part as well? I am guessing some code in the JApplicationAdministrator in another PR

@joomla-cms-bot joomla-cms-bot added the RTC This Pull Request is Ready To Commit label May 10, 2016
@zero-24
Member

zero-24 commented May 10, 2016

@brianteeman can we get a milestone here? ;)

@brianteeman brianteeman added this to the Joomla 3.6.0 milestone May 10, 2016
@zero-24
Member

zero-24 commented May 10, 2016

I have just sent a quick CS PR: https://github.com/fevangelou/joomla-cms/pull/1

$cookie_path = $conf->get('cookie_path', '/');
if ($app->isSite())
{
    setcookie("joomla_user_state", "logged_in", 0, $cookie_path, $cookie_domain, 0);
Member

@Fedik Fedik May 10, 2016

I think it should use Joomla $app->input->cookie->set()

Contributor Author

Why use the J API to set a cookie? Let's not overcomplicate things for such a simple thing.
On May 10, 2016 7:59 PM, "Fedir Zinchuk" notifications@github.com wrote:

In plugins/user/joomla/joomla.php
#10373 (comment):

@@ -234,6 +234,16 @@ public function onUserLogin($user, $options = array())
        // Hit the user last visit field
        $instance->setLastVisit();

+       // Add "user state" cookie used for reverse caching proxies like Varnish, Nginx etc.
+       $app           = JFactory::getApplication();
+       $conf          = JFactory::getConfig();
+       $cookie_domain = $conf->get('cookie_domain', '');
+       $cookie_path   = $conf->get('cookie_path', '/');
+       if ($app->isSite())
+       {
+           setcookie("joomla_user_state", "logged_in", 0, $cookie_path, $cookie_domain, 0);

I think it should use Joomla API $this->app->input->cookie->set()



Contributor

Great work. I think $app->input->cookie->set() would be better. Yes, setcookie is easy, but if someone checks in a plugin executed after this one what is set in the cookie, using $app->input->cookie->get(), he/she will get the correct value. The CS fixes by @zero-24 also make sense.
(finger hovering over the merge button) :-)

use $this->app & add one missing clean line
@joomla-cms-bot

This PR has received new commits.

CC: @andrepereiradasilva, @dgt41



@fevangelou
Contributor Author

Updated PR swapping setcookie with $this->app->input->cookie->set() as indicated by @rdeutz & @Fedik.

@fevangelou
Contributor Author

@dgt41 I'll prepare that for another PR when I have a little more time, so I can also have the related Varnish and Nginx configs ready for use ;)

@rdeutz rdeutz merged commit 55bcf6b into joomla:staging May 10, 2016
@joomla-cms-bot joomla-cms-bot removed the RTC This Pull Request is Ready To Commit label May 10, 2016
@fititnt
Contributor

fititnt commented Jun 30, 2016

Will this be merged into the code of Joomla 3.6?

If that is not certain, and it is only a matter of helping to test in production or making adjustments, I am committed to helping with this. Please just let me know.

cc @bcdonadio

@fevangelou
Contributor Author

It's already merged :)

Regards,
Fotis Evangelou


@zero-24
Member

zero-24 commented Jul 1, 2016

@fititnt you can test with the RC1 of 3.6: https://github.com/joomla/joomla-cms/releases/tag/3.6.0-rc :)

@gachla gachla mentioned this pull request Aug 10, 2016
@fred-the-coder

fred-the-coder commented Aug 20, 2016

Hi all,
To make the most of this PR, do we need to:

  1. perform a specific configuration of our Joomla site to use the Varnish cache?
  2. configure Varnish using fevangelou's customisation of the default.vcl file?

Thank you for your feedback.

@fevangelou
Contributor Author

@fred-the-coder everything you need is described in detail here https://gist.github.com/fevangelou/84d2ce05896cab5f730a (updated Sept 21st, 2016).

@iHRSd

iHRSd commented Oct 18, 2016

@fevangelou
Hi all, great job, thanks for sharing this...

But there is a very important case: excluding module positions and a cached version for signed-in users are required to complete the caching solution for Joomla and reverse caching proxies (Varnish, Nginx FastCGI)...

I'm testing the use of Joomla's file cache (or Memcached storage) and Varnish (or Nginx FastCGI) at the same time.

To support everything together (maybe).

@azeemdwt

@fevangelou : Where can I find a link for nginx configuration to implement this? Thanks in advance.

@fred-the-coder

Hi all,

I set up a mock-up for my customer here: http://ra.mosoft.fr

It is a news site where traffic will be high.
Thus, I use Joomla 3.6.x (actually 3.6.2) to benefit from Varnish on my hosting environment.

But I am afraid Varnish is not working...
I ran this command:

curl -I http://simple.gandi-test.fr/
HTTP/1.1 200 OK
Server: Apache/2.4.18
X-Powered-By: PHP/5.4.45-0+deb7u2
X-Logged-In: False
X-Content-Powered-By: K2 v2.7.1 (by JoomlaWorks)
Expires: Wed, 17 Aug 2005 00:00:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Set-Cookie: 8c344a4d64c96915ff93f25331a22a1d=n8i2emphg2ohkqoik9suhslkn3; path=/; HttpOnly
Last-Modified: Mon, 21 Nov 2016 21:15:59 GMT
Content-Type: text/html; charset=utf-8
Vary: Accept-Encoding
Date: Mon, 21 Nov 2016 21:15:59 GMT
Connection: keep-alive
Via: 1.1 varnish
Age: 0

The reported "Age" is "0", meaning cache is not working...am I wrong?
If not, what is wrong here?

Thank you for your help.

@brianteeman
Contributor

brianteeman commented Nov 29, 2016 via email

@fevangelou
Contributor Author

@azeemdwt A good starting point for Nginx, containing the exclusions recently added in Joomla can be found here: https://github.com/engintron/engintron/blob/master/nginx/proxy_params_dynamic

@fred-the-coder Provided you're using the config file I'm suggesting here https://gist.github.com/fevangelou/84d2ce05896cab5f730a (which BTW I updated today with additional improvements for both Varnish 3.x & 4.x), the first hit on your demo site will correctly report Age: 0 if no other hit came before it. For every subsequent one (and for the TTL of the cache) it should provide a value. You can compare with joomlaworks.net (using curl as you did already) and see for yourself.
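
For anyone who wants to script the Age comparison, here is a small Python sketch that reads the output of `curl -I` and reports what the Via and Age headers suggest. The function names are hypothetical, the sample is a shortened copy of the headers posted earlier in this thread, and a single Age: 0 is not conclusive on its own; a second request within the cache TTL is more telling.

```python
def parse_headers(raw):
    """Parse the text printed by `curl -I` into a dict with lower-cased header names."""
    headers = {}
    for line in raw.strip().splitlines()[1:]:  # skip the HTTP status line
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers


def describe_cache_state(headers):
    """Give a rough reading of the Via and Age headers of one response."""
    if "varnish" not in headers.get("via", "").lower():
        return "not served through Varnish"
    if int(headers.get("age", "0")) > 0:
        return "served from cache"
    return "fetched from backend (first hit or not cached)"


# Shortened sample taken from the curl output posted in this thread.
SAMPLE = """HTTP/1.1 200 OK
Server: Apache/2.4.18
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Date: Mon, 21 Nov 2016 21:15:59 GMT
Via: 1.1 varnish
Age: 0
"""

print(describe_cache_state(parse_headers(SAMPLE)))
```

Running the same check on two consecutive responses shows whether Age starts counting up on the second one.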

@azeemdwt

@fevangelou : Thanks a lot for the link.

@zero-24
Member

zero-24 commented Nov 29, 2016

I'm locking this now. Any further questions can be moved to the forum: https://forum.joomla.org or to the mailing lists below, as this is a bug tracker and not a support forum ;). Thanks for understanding.
https://groups.google.com/forum/#!forum/joomla-dev-cms
https://groups.google.com/forum/#!forum/joomla-dev-general

@joomla joomla locked and limited conversation to collaborators Nov 29, 2016