Message boards : Number crunching : Upload failures
Author | Message |
---|---|
Send message Joined: 30 Mar 10 Posts: 12 Credit: 2,609,109 RAC: 87 |
A few tasks are successfully uploaded :) but not all of them. Patience is a virtue |
Send message Joined: 31 Dec 07 Posts: 1152 Credit: 22,363,583 RAC: 5,022 |
"Never seen a problem from suspending tasks if BOINC isn't stopped and restarted. Also a long time since even doing that I have lost a Windows task." I have. It's rare, but I have occasionally had a WU crash after going through the suspend, wait a minute, exit the BOINC Manager process. FINALLY! My zips are uploading! |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Bernard said: I suggested that more channels are used to spread the message but I haven't seen elsewhere - so no one listened. This has been talked about, although not recently. No way I'm going to touch any "social media" stuff, and the BOINC Notices system will only get picked up by people who've not turned them off, and who bother to look at BOINC anyway. (Do they even appear in the simple view?) My post was intended as a "heads up" to those that read this board. And I meant by it, to stop running models so as to not accumulate lots of zips that would have to join the fight to get back to the server after things got fixed. People that crash hundreds of tasks without even wondering why they aren't getting any credit don't count. I haven't asked for an update on the jasmin situation, because I know that the Profs and Drs are busy with other matters. But I guess I should before the weekend sets in. (In my case, it looks like being a wet one.) |
Send message Joined: 22 Feb 06 Posts: 491 Credit: 31,050,534 RAC: 14,857 |
Or check the event log in the tools tab for the file transfer and you will get something like: 04/07/2019 22:15:49 | climateprediction.net | Started upload of wah2_safr50_n06x_201112_13_818_011860496_1_r430858283_5.zip 04/07/2019 22:15:49 | climateprediction.net | [file_xfer] URL: http://jasmin-upload.cpdn.org/cgi-bin/file_upload_handler 04/07/2019 22:15:50 | climateprediction.net | [http] [ID#100] Info: Trying 192.171.139.103... Patiently waiting for uploads to get going again! |
Send message Joined: 5 Jun 09 Posts: 97 Credit: 3,736,855 RAC: 4,073 |
"No way I'm going to touch any "social media" stuff, and the BOINC Notices system will only get picked up by people who've not turned them off, and who bother to look at BOINC anyway. (Do they even appear in the simple view?)" The answer to your question is "sort of": the Notices button gains a red border when you hover over it, which is not very obvious, and not every user will even bother looking at the simple view to see that. And last night something stirred in the land of Jasmin: the mountain of my backed-up zip files was transferred, starting round about 10pm BST. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Thanks for the bit about the Notices. Glad to hear that about your zips. I've only just sent an email asking for an update, and I said that people were still having trouble getting uploads to go, so I guess this is where it tries to make a liar out of me. |
Send message Joined: 18 Jul 13 Posts: 438 Credit: 25,620,508 RAC: 4,981 |
Thanks Les, I do appreciate all the efforts of people posting here and especially all of the moderators. However, I really think project people should be more proactive, especially in times of trouble, and we shouldn't have to plead for updates. It is just a few lines that need to be written in an e-mail, not typed on a typewriter and sent by horse. I did check the CPDN Twitter account recently (during the upload failures) and all I saw was info on the new OpenIFS project and how thousands of WUs were sent to crunchers. Well, they could also have sent an alert, posted on the CPDN website, and used BOINC. Of course some people will never get the message, but the idea is to try to reach as many as possible. It is great that the queues are clearing up; let's hope all goes well during the summer. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
OK, update: The issue is one of bandwidth onto the system. We are constantly running 146 upload processes at the moment and have been since I announced on the board that we were restarting jasmin. This is one of the issues around backing up when not running. We are investigating how we can set up another upload server within the infrastructure that can be run as part of a load-balancing pair behind the jasmin-upload name. David |
Send message Joined: 14 Aug 06 Posts: 22 Credit: 6,518,416 RAC: 9,956 |
Confirm please: You want us to suspend those CPDN tasks identified as safr 50 and sam 50. That is obviously easy to do on BOINC by simply hitting the SUSPEND button. However, that does not stop the transfer actions on those tasks that are already completed and in upload status. Regrets but this almost 88 year old brain is not understanding while trying to comply with instructions. Bill |
Send message Joined: 8 Jul 05 Posts: 33 Credit: 1,274,211 RAC: 0 |
It seems to be working now. I'm sure Les will clarify, but my backlog has cleared. |
Send message Joined: 17 Aug 05 Posts: 22 Credit: 16,057,688 RAC: 15,434 |
Billy Ewell: You can suspend network activity, but only in BOINC Manager's advanced view, I think. I don't think there is that option in the simple view. But no communication is possible while the network is suspended. I've suspended my uploads until further notice, but try now and then to see if I can upload. So far no luck here: I've got a few uploads waiting, but no running work units now. Think I will wait until the weekend is over. |
Send message Joined: 28 Nov 15 Posts: 50 Credit: 4,099,809 RAC: 0 |
I have had stuck uploads for the better part of a week from these 2 types of wah 5/07/2019 7:50:05 a.m. | climateprediction.net | Started upload of wah2_sam50_a4z3_201112_24_815_011854387_0_r799897574_20.zip 5/07/2019 7:55:13 a.m. | climateprediction.net | Temporarily failed upload of wah2_sam50_a4z3_201112_24_815_011854387_0_r799897574_20.zip: transient HTTP error 5/07/2019 7:50:05 a.m. | climateprediction.net | Started upload of wah2_safr50_n7cq_201512_13_820_011874340_0_r1435939452_8.zip 5/07/2019 7:55:13 a.m. | climateprediction.net | Temporarily failed upload of wah2_safr50_n7cq_201512_13_820_011874340_0_r1435939452_8.zip: transient HTTP error I have more trickles of the same series also stuck on another computer. I hope they can fix the servers soon, I am running out of space to store unsent trickles. Kind regards Vicki. |
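Log lines like the ones in the post above could be tallied with a short script to see which zips are still stuck. What follows is only a minimal Python sketch, not part of BOINC: `summarize_uploads` and `stuck_files` are hypothetical helper names, the input is assumed to be event-log text copied out of the Event Log window with the same ` | ` field separator seen in this thread, and a successful transfer is assumed to be logged as "Finished upload of ...", a message that does not appear in the posts here.

```python
import re
from collections import Counter

# Matches the message part of an event-log line about an upload.
# "Finished" is an assumption; only "Started" and "Temporarily failed"
# are shown in this thread.
UPLOAD_RE = re.compile(
    r"(Started|Finished|Temporarily failed) upload of (\S+?\.zip)"
)


def summarize_uploads(log_text):
    """Return {filename: Counter of event kinds} for upload lines."""
    summary = {}
    for line in log_text.splitlines():
        # Fields are "timestamp | project | message"; the timestamp
        # format varies by locale, so it is not parsed here.
        parts = line.split(" | ", 2)
        if len(parts) != 3:
            continue
        match = UPLOAD_RE.search(parts[2])
        if match:
            kind, name = match.groups()
            summary.setdefault(name, Counter())[kind] += 1
    return summary


def stuck_files(summary):
    """Files with at least one failure and no finished upload."""
    return sorted(
        name
        for name, counts in summary.items()
        if counts["Temporarily failed"] > 0 and counts["Finished"] == 0
    )


if __name__ == "__main__":
    sample = (
        "5/07/2019 7:50:05 a.m. | climateprediction.net | Started upload of "
        "wah2_sam50_a4z3_201112_24_815_011854387_0_r799897574_20.zip\n"
        "5/07/2019 7:55:13 a.m. | climateprediction.net | Temporarily failed "
        "upload of wah2_sam50_a4z3_201112_24_815_011854387_0_r799897574_20.zip"
        ": transient HTTP error\n"
    )
    for name in stuck_files(summarize_uploads(sample)):
        print("still waiting:", name)
```

Splitting on the separator rather than parsing the timestamp keeps the sketch indifferent to the two date formats posted in this thread.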
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Vicki First of all, they're zip files, not trickles. (Which are RPCs, and don't show up in the Transfers tab.) Second, if you're running out of space then Suspend either each model in the Tasks tab, or the project in the Projects tab. I said this a week ago. Then wait for however long it takes to clear. There are posts from the project's technical manager in several places in this thread explaining the situation. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Billy The reason for Suspending the models is so that they don't keep creating zips, and therefore making more problems for you, the individual. Leaving the computer connected to the internet all the time will allow what zips you already have to eventually upload. Which may be another week. Or two. Who knows. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
KWSN Sir Clark Thanks for that. Nice to have some good news. |
Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653 |
All of my sam50 zips (around a hundred) have gone. But I have a couple of cam25 zips that have been hanging for over a week. They have been retried 50 to 62 times. Do they go to a different server? |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,039,635 RAC: 18,944 |
All of my sam50 zips (around a hundred) have gone. But I have a couple of cam25 zips that have been hanging for over a week. They have been retried 50 to 62 times. Yes, in Mexico I believe. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Hi Jim I had 3 cam25s a year or 2 back. One uploaded OK, one repeated zips 3 and 6 each time a new zip was created, and the third did the same, but I've forgotten how many it had trouble with. In the end I just Aborted those two after all zips were created and had been given a chance to upload. I don't know what it is with those cams, but I wouldn't waste too much time on them. Let someone else have a try. And it's good to hear that your others have gone. That's two people with good news. |
Send message Joined: 13 Jan 07 Posts: 195 Credit: 10,581,566 RAC: 0 |
Like other posters, my zips for SAM50 and SAFR50 models have all cleared. But one ZIP for a CAM25 model repeatedly sticks at 13.47 percent of the upload. I have a folk memory that we have had this issue before but cannot remember whether there was a solution or whether we all just aborted them as Les suggests. |
Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653 |
Thanks for the history on the cam25's. I found the one in question, and it has returned 18 zips. https://www.cpdn.org/cpdnboinc/result.php?resultid=21709022 However, the ones that are stuck are #12 and #13. So it looks like they got lost in the shuffle. If they have not uploaded by the time my other work has finished tomorrow, I will just can them (as in trash can; I just realized that may not be clear to non-native English speakers). |