4.19 "No schedulers responded" behind http proxy
Author | Message |
---|---|
Joined: 6 Oct 04 Posts: 5 Credit: 121,881 RAC: 0 |
Hi, I've recently upgraded my Boinc client to 4.19. Since installing the upgrade my machine has been unable to communicate with the cpdn servers and gives a "deferring communication" message. My machine is behind an HTTP proxy, which is correctly specified within Boinc and worked quite happily under previous versions of the Boinc client. Has anyone else seen this, and can you tell me how I can re-enable communication? Thanks in anticipation, Richard rich at richardlinney.co.uk |
Joined: 2 Sep 04 Posts: 44 Credit: 372,682 RAC: 0 |
Hi Richard The BOINC dev team are trying to track down exactly what's going on with these proxy problems. Can you please advise which proxy application you're using and as much other technical detail as possible? Please use caution regarding the tech details and please don't post anything that you feel might compromise your security. Feel free to post excerpts from message logs, but you can hash out specifics that might be compromising. e.g. Internal IP of proxy = 192.1.1.1 Hostname of proxy = ####.ox.ac.uk Thanks |
Joined: 6 Oct 04 Posts: 5 Credit: 121,881 RAC: 0 |
Hi, Thanks for your very swift reply... The proxy server is Novell BorderManager 3.7 - and I'm the IT Manager - so you can have as much technical information as you require - but obviously FQDNs and IP addresses will be spoofed... The Boinc client is simply configured to "connect via proxy server" and I've specified the hostname of the proxy server (I've also tried simply the IP address of the proxy server - and this doesn't work either). I'm also obviously specifying the http proxy port, which is 8080. No authentication is required within the Boinc client as the machine authenticates to the proxy through Directory Services at logon, and is then transparent to all applications. As I say, there were no problems with previous versions of Boinc. Is there any other information I can provide to help? Thanks, Richard |
Joined: 2 Sep 04 Posts: 44 Credit: 372,682 RAC: 0 |
> Thanks for your very swift reply...
No prob, but can't always promise it'll be that way. ;-)
> The proxy server is Novell BorderManager 3.7 - and I'm the IT Manager - so you
ok.
> Is there any other information I can provide to help?
Are you in a position to capture the network packets being sent between the host and the proxy? If so, then you could examine the data part of the packet and see what reply is being returned to the host from the proxy.
If not (and maybe less complicated :), you can turn on logging in the BOINC client. This page <a href="http://boinc.berkeley.edu/client_debug.php">Core client: debugging</a> describes how to set up a file called log_flags.xml and what debugging options are available.
<b>Note:</b> You need to stop and start the client after creating the flags file, in order for the client to detect the debugging info.
There are quite a few debugging options and lots of data can be generated. I would suggest starting with <http_debug/>. The debug data will be in the stdout.txt file in your BOINC folder. You won't see it in the messages tab of the client. There's a size limit on the stdout.txt file, so if you leave the log_flags active for too long, the stdout file can wrap and useful data can be lost.
It's not necessary (nor advisable) for you to post the entire contents of the stdout.txt file here, but have a look for the http_debug sections and see if anything there makes sense and post the useful details. If not, we can work out a means for you to send the stdout file to either myself or one of the devs, so that we can examine it. |
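For anyone who would rather not scan stdout.txt by eye, the http_debug entries can be pulled out with a few lines of Python. This is only a sketch: it assumes each entry starts with a `YYYY-MM-DD HH:MM:SS` timestamp followed by a bracketed tag such as `[DEBUG_HTTP ]`, which is the shape of the log excerpts posted in this thread, and the name `http_debug_lines` is made up for illustration.

```python
import re

# Assumed entry shape, based on the excerpts posted in this thread:
#   2005-02-17 12:44:27 [DEBUG_HTTP ] HTTP_REPLY_HEADER::parse(): status=200
LOG_LINE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[([^\]]+)\] (.*)$")


def http_debug_lines(lines):
    """Yield (timestamp, message) for every entry tagged DEBUG_HTTP."""
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        # The tag is padded with a trailing space in the log, so strip it.
        if match and match.group(2).strip() == "DEBUG_HTTP":
            yield match.group(1), match.group(3)


# Three entries copied from the thread; only the last one is DEBUG_HTTP.
sample = [
    "2005-02-17 12:44:23 [climateprediction.net] Sending request to scheduler: "
    "http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi",
    "2005-02-17 12:44:23 [DEBUG_NET_XFER ] NET_XFER_SET::do_select(): queried 1, returned 0",
    "2005-02-17 12:44:27 [DEBUG_HTTP ] HTTP_REPLY_HEADER::parse(): status=200",
]

for stamp, message in http_debug_lines(sample):
    print(stamp, message)
```

To use it on a real log, pass an open file instead of `sample`, for example `with open("stdout.txt") as f:` followed by the same loop over `http_debug_lines(f)`. The file name comes from the post above; its location in the BOINC folder is assumed.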
Joined: 2 Sep 04 Posts: 44 Credit: 372,682 RAC: 0 |
> > Is there any other information I can provide to help? A couple more questions. Which Windows operating system is the client that is not working installed on? Which BOINC version did you upgrade from? TIA |
Joined: 6 Oct 04 Posts: 5 Credit: 121,881 RAC: 0 |
Hi, I can do a packet capture if necessary - but not this afternoon as I'm just between meetings! I have switched on the http and netxfer debugging, and a portion of the log is pasted below - hope this is useful, it doesn't reveal anything at all to me. I can't see any reference to the proxy - so is Boinc simply not trying to use the proxy? In terms of client, this is running on XP SP2. Thanks, Richard
2005-02-17 12:44:23 [climateprediction.net] Sending request to scheduler: http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
2005-02-17 12:44:23 [DEBUG_HTTP ] HTTP_OP::init_post(): 00953A30 io_done 0
2005-02-17 12:44:23 [DEBUG_NET_XFER ] NET_XFER_SET::do_select(): queried 1, returned 0
2005-02-17 12:44:23 [DEBUG_NET_XFER ] CLIENT_STATE::net_sleep(0.000000)
2005-02-17 12:44:23 [DEBUG_NET_XFER ] NET_XFER_SET::do_select(): queried 1, returned 1
2005-02-17 12:44:23 [DEBUG_NET_XFER ] NET_XFER_SET::do_select(): write enabled on socket 596
2005-02-17 12:44:23 [DEBUG_NET_XFER ] NET_XFER_SET::do_select(): socket 596 is connected
2005-02-17 12:44:24 [DEBUG_NET_XFER ] NET_XFER_SET::do_select(): queried 1, returned 1
2005-02-17 12:44:24 [DEBUG_NET_XFER ] NET_XFER_SET::do_select(): write enabled on socket 596
2005-02-17 12:44:25 [DEBUG_NET_XFER ] NET_XFER_SET::do_select(): queried 1, returned 1
2005-02-17 12:44:25 [DEBUG_NET_XFER ] NET_XFER_SET::do_select(): write enabled on socket 596
2005-02-17 12:44:25 [DEBUG_HTTP ] HTTP_OP_SET::poll(): wrote HTTP header to socket 596: 233 bytes
2005-02-17 12:44:25 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: POST http://climateapps2.oucs.ox.ac.uk:80/cpdnboinc_cgi/cgi HTTP/1.0
2005-02-17 12:44:25 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Pragma: no-cache
2005-02-17 12:44:25 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Cache-Control: no-cache
2005-02-17 12:44:25 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Host: climateapps2.oucs.ox.ac.uk:80
2005-02-17 12:44:25 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Connection: close
2005-02-17 12:44:25 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Content-Type: application/octet-stream
2005-02-17 12:44:25 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Content-Length: 5638
2005-02-17 12:44:25 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header:
2005-02-17 12:44:25 [DEBUG_NET_XFER ] NET_XFER_SET::do_select(): queried 1, returned 1
2005-02-17 12:44:25 [DEBUG_NET_XFER ] NET_XFER_SET::do_select(): write enabled on socket 596
2005-02-17 12:44:25 [DEBUG_NET_XFER ] NET_XFER::do_xfer(): wrote 5638 bytes to socket 596
2005-02-17 12:44:25 [DEBUG_HTTP ] HTTP_OP_SET::poll(): finished sending request body
2005-02-17 12:44:25 [DEBUG_NET_XFER ] NET_XFER_SET::do_select(): queried 1, returned 0
2005-02-17 12:44:26 [DEBUG_NET_XFER ] NET_XFER_SET::do_select(): queried 1, returned 0
2005-02-17 12:44:26 [DEBUG_NET_XFER ] CLIENT_STATE::net_sleep(0.000000)
2005-02-17 12:44:26 [DEBUG_NET_XFER ] NET_XFER_SET::do_select(): queried 1, returned 1
2005-02-17 12:44:26 [DEBUG_NET_XFER ] NET_XFER_SET::do_select(): read enabled on socket 596
2005-02-17 12:44:27 [DEBUG_NET_XFER ] NET_XFER_SET::do_select(): queried 1, returned 1
2005-02-17 12:44:27 [DEBUG_NET_XFER ] NET_XFER_SET::do_select(): read enabled on socket 596
2005-02-17 12:44:27 [DEBUG_HTTP ] HTTP_OP_SET::poll(): reading reply header; io_ready 1 io_done 0
2005-02-17 12:44:27 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: HTTP/1.0 200 OK
2005-02-17 12:44:27 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Date: Thu, 17 Feb 2005 12:42:13 GMT
2005-02-17 12:44:27 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Server: Apache/2.0.50 (Unix) PHP/4.3.8
2005-02-17 12:44:27 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Content-Type: text/plain
2005-02-17 12:44:27 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Content-Length: 799
2005-02-17 12:44:27 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header:
2005-02-17 12:44:27 [DEBUG_HTTP ] HTTP_REPLY_HEADER::parse(): status=200
2005-02-17 12:44:27 [DEBUG_HTTP ] HTTP_REPLY_HEADER::parse(): content_length=799
2005-02-17 12:44:27 [DEBUG_HTTP ] HTTP_REPLY_HEADER::parse(): content_length=799
2005-02-17 12:44:27 [DEBUG_NET_XFER ] NET_XFER_SET::do_select(): queried 1, returned 1
2005-02-17 12:44:27 [DEBUG_NET_XFER ] NET_XFER_SET::do_select(): read enabled on socket 596
2005-02-17 12:44:27 [DEBUG_NET_XFER ] NET_XFER::do_xfer(): read 799 bytes from socket 596
2005-02-17 12:44:27 [DEBUG_NET_XFER ] NET_XFER_SET::do_select(): queried 1, returned 1
2005-02-17 12:44:27 [DEBUG_NET_XFER ] NET_XFER_SET::do_select(): read enabled on socket 596
2005-02-17 12:44:27 [DEBUG_NET_XFER ] NET_XFER::do_xfer(): read 0 bytes from socket 596
2005-02-17 12:44:27 [DEBUG_HTTP ] HTTP_OP_SET::poll(): got reply body
2005-02-17 12:44:27 [DEBUG_NET_XFER ] CLIENT_STATE::net_sleep(0.000000)
2005-02-17 12:44:27 [DEBUG_NET_XFER ] CLIENT_STATE::net_sleep(0.000000)
... etc |
Joined: 2 Sep 04 Posts: 44 Credit: 372,682 RAC: 0 |
> I have switched on the http and netxfer debugging, and a portion of the log is
I'd turn off the netxfer for the moment, as it generates a lot of output, making the rest a bit hard to see.
> pasted below - hope this is useful, it doesn't reveal anything at all to me. I can't see any reference to the proxy - so is Boinc simply not trying to use the proxy? In terms of client, this is running on XP SP2.
Nope, BOINC is using the proxy, but in an invisible kind of way. The proxy's details do not get written to the log file; only the destination details are logged using this flag.
Ok, here's a bit of explanation of what's happening:
Client sends a request to the scheduler (via your proxy):
> 2005-02-17 12:44:23 [climateprediction.net] Sending request to scheduler: http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
Here's the HTTP request header that was written, in HTTP POST format:
> 2005-02-17 12:44:25 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: POST http://climateapps2.oucs.ox.ac.uk:80/cpdnboinc_cgi/cgi HTTP/1.0
> 2005-02-17 12:44:25 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Pragma: no-cache
> 2005-02-17 12:44:25 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Cache-Control: no-cache
> 2005-02-17 12:44:25 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Host: climateapps2.oucs.ox.ac.uk:80
> 2005-02-17 12:44:25 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Connection: close
> 2005-02-17 12:44:25 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Content-Type: application/octet-stream
> 2005-02-17 12:44:25 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Content-Length: 5638
> 2005-02-17 12:44:25 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header:
> 2005-02-17 12:44:25 [DEBUG_HTTP ] HTTP_OP_SET::poll(): finished sending request body
And here's the reply from the scheduler:
> 2005-02-17 12:44:27 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: HTTP/1.0 200 OK
> 2005-02-17 12:44:27 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Date: Thu, 17 Feb 2005 12:42:13 GMT
> 2005-02-17 12:44:27 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Server: Apache/2.0.50 (Unix) PHP/4.3.8
> 2005-02-17 12:44:27 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Content-Type: text/plain
> 2005-02-17 12:44:27 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Content-Length: 799
> 2005-02-17 12:44:27 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header:
Up to this point, I can confirm that your client is indeed talking to the scheduler via your proxy and is receiving a reply. We'd need to see more data from your log file to understand where the failure is occurring. Thx |
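One clue that the client is addressing a proxy is visible in the logged request line itself: the POST names the full URL (`http://host:80/path`) rather than just the path, and that absolute form is what an HTTP client sends when it talks to a proxy. A small Python sketch of that check follows. The helper names `parse_request_line` and `is_absolute_form` are hypothetical, and the check only describes how the request was written; it says nothing about what the proxy then did with it.

```python
from urllib.parse import urlsplit


def parse_request_line(line):
    """Split 'METHOD target HTTP/x.y' into its three parts."""
    method, target, version = line.split()
    return method, target, version


def is_absolute_form(target):
    """True when the request target is a full URL (scheme and host),
    the form an HTTP client uses when sending a request to a proxy."""
    parts = urlsplit(target)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# Request line copied from the log excerpt in this thread.
line = "POST http://climateapps2.oucs.ox.ac.uk:80/cpdnboinc_cgi/cgi HTTP/1.0"
method, target, version = parse_request_line(line)

print(method, version, is_absolute_form(target))
# A direct (non-proxy) request would normally carry only the path:
print(is_absolute_form("/cpdnboinc_cgi/cgi"))
```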
Joined: 6 Oct 04 Posts: 5 Credit: 121,881 RAC: 0 |
> We'd need to see more data from your log file to understand where the failure is occurring.
Which data do you want to see, from which log file? I'm not going to be on site again until Monday unfortunately, so I'll send you the data as soon as I'm back in the office... Here's the failure message as in stderr.txt:
2005-02-17 12:43:50 [climateprediction.net] Scheduler RPC to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi failed
2005-02-17 12:43:50 [climateprediction.net] No schedulers responded
2005-02-17 12:43:50 [climateprediction.net] Deferring communication with project for 1 minutes and 0 seconds
2005-02-17 12:44:27 [climateprediction.net] Scheduler RPC to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi failed
2005-02-17 12:44:27 [climateprediction.net] No schedulers responded
2005-02-17 12:44:27 [climateprediction.net] Deferring communication with project for 1 minutes and 0 seconds |
Joined: 2 Sep 04 Posts: 44 Credit: 372,682 RAC: 0 |
> Which data do you want to see, from which log file?
Basically all the [DEBUG_HTTP ] data in the stdout.txt file from
2005-02-17 12:44:23 [climateprediction.net] Sending request to scheduler: http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
until
2005-02-17 12:44:27 [climateprediction.net] Deferring communication with project for 1 minutes and 0 seconds
I'll email you privately and request that you send me the data, rather than post it all here. Thanks |
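Cutting that window out of stdout.txt can also be scripted. The Python sketch below keeps only the `[DEBUG_HTTP` entries whose timestamp falls between two given times, inclusive. It assumes every entry begins with a 19-character `YYYY-MM-DD HH:MM:SS` stamp, as in the excerpts in this thread; the function name `entries_between` is invented for the example.

```python
from datetime import datetime

STAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed from the posted log excerpts


def entries_between(lines, start, end):
    """Return DEBUG_HTTP entries stamped between start and end (inclusive)."""
    start_t = datetime.strptime(start, STAMP_FORMAT)
    end_t = datetime.strptime(end, STAMP_FORMAT)
    selected = []
    for line in lines:
        try:
            stamp = datetime.strptime(line[:19], STAMP_FORMAT)
        except ValueError:
            continue  # no leading timestamp, e.g. a wrapped or truncated line
        if start_t <= stamp <= end_t and "[DEBUG_HTTP" in line:
            selected.append(line.rstrip("\n"))
    return selected


# A few entries copied from the thread, plus one line without a timestamp.
sample = [
    "2005-02-17 12:43:50 [climateprediction.net] No schedulers responded",
    "2005-02-17 12:44:25 [DEBUG_HTTP ] HTTP_OP_SET::poll(): finished sending request body",
    "2005-02-17 12:44:26 [DEBUG_NET_XFER ] CLIENT_STATE::net_sleep(0.000000)",
    "2005-02-17 12:44:27 [DEBUG_HTTP ] HTTP_OP_SET::poll(): got reply body",
    "... etc",
]

window = entries_between(sample, "2005-02-17 12:44:23", "2005-02-17 12:44:27")
for entry in window:
    print(entry)
```

Filtering by time rather than by the two marker messages is a deliberate choice here, since the posts above show the "Deferring communication" message coming from stderr.txt while the debug entries are in stdout.txt.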
Joined: 6 Oct 04 Posts: 5 Credit: 121,881 RAC: 0 |
Hi, Please could you email me again with your email address! I'm back in the office, and your email is on my PC at home! I'll send you the logs etc as requested. 4.19 is still failing to talk to your servers. The two work units are now showing "computation error" and it seems to be attempting to send back 5 zip files... Sorry/Thanks!.. Richard |
Joined: 2 Sep 04 Posts: 44 Credit: 372,682 RAC: 0 |
> Please could you email me again with your email address! done. |
Joined: 19 Sep 04 Posts: 3 Credit: 148,649 RAC: 0 |
Any news on this issue? I am having the exact same problem as outlined below. I've also heard that the Oxford server had some issues but has since been repaired. However, this does not seem to have fixed the "No schedulers responded" issue (at least for me!). Anyone? |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Lewelma, when I get this message, I disable BOINC network access, wait 10 seconds, and re-enable access. Works for me. It might also be caused by a flood of data trying to get access to the server, at intervals determined by the BOINC clients on a lot of hosts. I just had another thought. Are you saying that you've been getting "No schedulers responded" for some time to trickles AND uploads? Les |
Joined: 19 Sep 04 Posts: 3 Credit: 148,649 RAC: 0 |
I tried your "wait 10 seconds" solution, and am still getting the exact same errors as in message 9472. I have plenty of work left to do, but the "No schedulers responded" issue has been ongoing since installing Boinc client 4.19. I've checked the proxy and it's correct. How do I check between the trickles and uploads? Thanks! |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
A trickle is a small, 91-byte file that gets created every 10 hours on my P4 3.2G computer. On yours it could be 11 or 12 hours. They are stored in the climateprediction.net directory. So, if your model is still running but not 'trickling', you should have accumulated a few by now. And does BOINC say Running in the work tab? While trickles are also uploads, I really meant the big 7Meg upload at the end of 72 trickles. This is a set of 5 zipped files, which contain the results of the computation for a model. These 5 zips are in the same directory as trickles. The last trickle on your Account data page is for: 23 Feb 2005 at 13:39:18GMT. Was this about when you installed BOINC 4.19? This version is supposed to be reliable, but something may have gone wrong during installation. Anything informative in the stdout or stderr files? (In the main BOINC folder.) Les |
Joined: 30 Aug 04 Posts: 142 Credit: 9,936,132 RAC: 0 |
If you have messages like "deferring communications for...", go to the projects tab, select the project you want, right-click it, and select update. You may also want to look at <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_thread.php?id=1887">Thyme Lawn's thread about scheduler limits</a>. |
Joined: 2 Sep 04 Posts: 44 Credit: 372,682 RAC: 0 |
Hi Lewelma
> Any news on this issue? I am having the exact same problem as outlined below.
Apologies for the delay. Yes, the problem has been found and corrected in the current BOINC client development branch. The BOINC team suggest that, if you are suffering from proxy-related problems, you use the later version of the software:
[Quote from <a href="http://boinc.berkeley.edu/download.php">Download BOINC client software</a>] Version 4.19 (released 25 Jan 2005) This version doesn't work with some HTTP proxies. If you use a proxy and experience problems, please use <a href="http://boinc.berkeley.edu/download.php?dev=1#dev">version 4.23</a>, which fixes this problem. [End Quote/]
Actually v4.23 has been replaced with v4.24 at the moment.
> I've also heard that the Oxford server had some issues but has since been repaired. However, this does not seem to have fixed the "No schedulers responded" issue (at least for me!).
Yes, that's fixed from v4.23 onwards. :) Please remember that these later versions are a bit different from what you may be used to, and being Alpha software, they could possibly introduce other problems. In any event, it looks like the new versions may be released to the public quite soon. Fingers crossed. |
Joined: 5 Aug 04 Posts: 1283 Credit: 15,824,334 RAC: 0 |
> Please remember that these later versions are a bit different from what you > may be used to, and being Alpha software, they could possibly introduce other > problems. You're going to have big problems if you upgrade to a development version as they require a signature on all the downloaded files. CPDN doesn't have a signature on the 3 hadsm3*.zip files, and BOINC will reject them. This will continue to be the case until the server side software is rebuilt using the current BOINC development source. See <a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_thread.php?id=2011">this thread</a>. "The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer |
Joined: 2 Sep 04 Posts: 44 Credit: 372,682 RAC: 0 |
> You're going to have big problems if you upgrade to a development version as > they require a signature on all the downloaded files. CPDN doesn't have a > signature on the 3 hadsm3*.zip files, and BOINC will reject them. This will > continue to be the case until the server side software is rebuilt using the > current BOINC development source. Thanks Thyme Lawn You are right about the downloads. If you're busy with a model and need to trickle up, the development version could work for you. If you're nearing the end of a model, you should probably let the model complete and suspend network activity until the CPDN server software has been updated. Downloading a new model (or work from most BOINC projects atm.) will fail, as explained in the thread noted by Thyme. Hope this doesn't confuse everyone too much. :-/ |
Joined: 19 Sep 04 Posts: 3 Credit: 148,649 RAC: 0 |
So, the official word is "update to version 4.24" to fix the trickle up problem, and then wait until a new "official" version is released for new (downloaded) work. Interestingly, I attempted to add the new Einstein project to BOINC, and it was unable to download any work, which jibes with the "Downloading a new model will fail" expectation of the current Boinc 4.19. I've put that one on hold for the time being. I have since updated to 4.24, and everything is uploading, and it appears that I'm getting data for the Einstein thing as well. Looks like I'm fixed! I will keep an eye out for the "official" release of the latest BOINC software and keep up with it. Thanks for your help! |
©2024 cpdn.org