Thread 'Batch 996 Weather@Home2 East Asia25'

Iain Inglis
Volunteer moderator

Joined: 16 Jan 10
Posts: 1084
Credit: 7,808,726
RAC: 5,192
Message 69831 - Posted: 13 Oct 2023, 15:59:56 UTC

Limiting the upload speed to 150 kBps (Options | Computing preferences... | Network) has worked for my stuck uploads. After a week or so in which every upload failed, and after losing one model while trying other fixes, they're now all uploading, slowly. I might try raising the limit, if possible.

Thanks to those who suggested that approach.

Back in business!
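
For anyone who would rather set the cap in a file than in the Manager dialog, here is a minimal Python sketch that builds an equivalent preferences override. It assumes the BOINC client reads a global_prefs_override.xml file containing a max_bytes_sec_up element measured in bytes per second, and that 1 KB means 1000 bytes; both should be checked against the BOINC documentation. build_upload_override is a made-up helper name.

```python
import xml.etree.ElementTree as ET


def build_upload_override(limit_kb_per_sec: float, bytes_per_kb: int = 1000) -> str:
    """Return XML text for a preferences override that caps upload speed.

    Assumption: the client expects the cap in bytes per second inside
    <global_preferences><max_bytes_sec_up>.
    """
    root = ET.Element("global_preferences")
    cap = ET.SubElement(root, "max_bytes_sec_up")
    cap.text = str(int(limit_kb_per_sec * bytes_per_kb))
    return ET.tostring(root, encoding="unicode")


# 150 is the limit that worked in the post above.
override_xml = build_upload_override(150)
print(override_xml)
```

The text would be saved as global_prefs_override.xml in the BOINC data directory, whose location differs by platform, and the client would then need to re-read its local preferences.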
ID: 69831
bibi

Joined: 22 Dec 08
Posts: 7
Credit: 21,869,243
RAC: 28,113
Message 69832 - Posted: 13 Oct 2023, 16:42:11 UTC
Last modified: 13 Oct 2023, 17:34:36 UTC

What a torture. The limit seems to be 100 KBps/IP. So I configured the three PCs to 33 KBps each.

Now all uploads are coming to their end.
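
The arithmetic behind that split, as a small Python sketch. The 100 KBps per-IP cap is only the figure guessed at above, not a confirmed server setting, and per_host_limit is a hypothetical helper.

```python
def per_host_limit(cap_per_ip_kbps: int, hosts: int) -> int:
    """Split a per-IP upload cap evenly across hosts, rounding down."""
    if hosts < 1:
        raise ValueError("need at least one host")
    return cap_per_ip_kbps // hosts


limit = per_host_limit(100, 3)  # three PCs behind one public IP
total = limit * 3               # combined rate stays under the assumed cap
print(limit, total)
```

Rounding down keeps the combined rate (99 KBps here) just below the assumed cap rather than just above it.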
ID: 69832
ChelseaOilman

Joined: 24 Dec 19
Posts: 32
Credit: 40,991,754
RAC: 77,248
Message 69833 - Posted: 13 Oct 2023, 16:47:24 UTC

Of the retry zip and zip files waiting and unable to upload, are these all trickle files? Or are some finished tasks? How do I tell?
Clearly some of my trickle files are getting through, as I'm earning credits. It would be a shame if you're not getting the finished tasks.
ID: 69833
ChelseaOilman

Joined: 24 Dec 19
Posts: 32
Credit: 40,991,754
RAC: 77,248
Message 69834 - Posted: 13 Oct 2023, 16:52:42 UTC - in response to Message 69832.  

What a torture. The limit seems to be 100 KBps/IP. So I configured the three PCs to 33 KBps each.

I just tried limiting upload speed to 50KB/second on my Dell XPS 15 and it made no difference. The 2 files waiting to upload still won't go.
ID: 69834
Iain Inglis
Volunteer moderator

Joined: 16 Jan 10
Posts: 1084
Credit: 7,808,726
RAC: 5,192
Message 69835 - Posted: 13 Oct 2023, 16:55:34 UTC - in response to Message 69833.  

Of the retry zip and zip files waiting and unable to upload, are these all trickle files? Or are some finished tasks? How do I tell?
Clearly some of my trickle files are getting through, as I'm earning credits. It would be a shame if you're not getting the finished tasks.


The trickle files, which cause credits to be allocated, are small files that have been uploading for everyone (I think).

The larger Zip files have the useful data and they are sometimes having problems.
ID: 69835
Richard Haselgrove

Joined: 1 Jan 07
Posts: 1061
Credit: 36,703,308
RAC: 9,860
Message 69836 - Posted: 13 Oct 2023, 17:01:32 UTC - in response to Message 69833.  
Last modified: 13 Oct 2023, 17:02:32 UTC

You need to be aware that "trickles" are different from "uploads". A trickle is a small administrative progress report, sent direct to CPDN HQ in Oxford, England. An upload is a huge wodge of climate research data, sent to the scientific research hub in the region under investigation - Korea, in this case.

Credit is awarded on the basis of the trickles received - a measure of the amount of effort your computer has put into the work so far. It isn't conditional on the successful transfer of the results - though this one of mine did get through.

13/10/2023 02:16:00 | climateprediction.net | Sending scheduler request: To send trickle-up message.
13/10/2023 02:16:01 | climateprediction.net | Scheduler request completed
13/10/2023 02:16:11 | climateprediction.net | Started upload of wah2_eas25_a02o_198512_24_996_012223644_0_r1306186109_23.zip
13/10/2023 02:19:02 | climateprediction.net | Finished upload of wah2_eas25_a02o_198512_24_996_012223644_0_r1306186109_23.zip (99123610 bytes)
You can see that the two different reports were sent at very close to the same time, but were handled differently.
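
As a rough illustration of reading such a log, the Python sketch below pairs the "Started upload" and "Finished upload" lines quoted above and works out the average transfer rate, while ignoring the scheduler (trickle) lines. The line format is taken from the excerpt only; other client versions or locales may print timestamps differently, and upload_rate is a hypothetical helper.

```python
import re
from datetime import datetime

LOG = """\
13/10/2023 02:16:00 | climateprediction.net | Sending scheduler request: To send trickle-up message.
13/10/2023 02:16:01 | climateprediction.net | Scheduler request completed
13/10/2023 02:16:11 | climateprediction.net | Started upload of wah2_eas25_a02o_198512_24_996_012223644_0_r1306186109_23.zip
13/10/2023 02:19:02 | climateprediction.net | Finished upload of wah2_eas25_a02o_198512_24_996_012223644_0_r1306186109_23.zip (99123610 bytes)
"""

STAMP = "%d/%m/%Y %H:%M:%S"  # day-first format, as in the excerpt
STARTED = "Started upload of "
FINISHED = re.compile(r"Finished upload of (\S+) \((\d+) bytes\)")


def upload_rate(log_text):
    """Return (file name, seconds taken, bytes per second) for the first finished upload."""
    started = {}
    for line in log_text.splitlines():
        if line.count("|") < 2:
            continue  # not an event-log line
        stamp, _project, message = (part.strip() for part in line.split("|", 2))
        when = datetime.strptime(stamp, STAMP)
        if message.startswith(STARTED):
            started[message[len(STARTED):]] = when
            continue
        match = FINISHED.match(message)
        if match and match.group(1) in started:
            seconds = (when - started[match.group(1)]).total_seconds()
            return match.group(1), seconds, int(match.group(2)) / seconds
    return None


name, seconds, rate = upload_rate(LOG)
print(f"{name}: {seconds:.0f} s, about {rate / 1000:.0f} kB/s")
```

For the excerpt this gives 171 seconds for 99,123,610 bytes, roughly 580 kB/s on average; the timestamps only have one-second resolution, so the figure is approximate.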
ID: 69836
ChelseaOilman

Joined: 24 Dec 19
Posts: 32
Credit: 40,991,754
RAC: 77,248
Message 69839 - Posted: 13 Oct 2023, 18:19:44 UTC - in response to Message 69836.  

An upload is a huge wodge of climate research data, sent to the scientific research hub in the region under investigation - Korea, in this case.

Credits are fine but I'm more interested in not wasting expensive electricity spinning my wheels for nothing. You would think Korea would be more interested in fixing this issue as they won't be getting the results if they don't.
ID: 69839
Tomcat

Joined: 29 May 15
Posts: 17
Credit: 717,192
RAC: 12,206
Message 69842 - Posted: 13 Oct 2023, 21:19:39 UTC - in response to Message 69804.  

We can raise the limit to allow for transfers still waiting to go. As Dave says, it's a limit set by CPDN on how much disk the task is expected to use. It's normally set with a decent margin but I suspect it hasn't been changed for some time and the regions for WAH2 have got bigger.


That seems like a good idea, thanks. Having the limit be large enough to accommodate all uploads getting stuck might be helpful.

On a side note, looks like both my wingmen failed the workunit for other reasons. That means there won't be any scientific results for this workunit despite a total of almost 8 days of crunching. It's a pity, really.
ID: 69842
Tomcat

Joined: 29 May 15
Posts: 17
Credit: 717,192
RAC: 12,206
Message 69843 - Posted: 13 Oct 2023, 21:29:40 UTC
Last modified: 13 Oct 2023, 21:39:25 UTC

Update: My uploads are now finally working. That is after limiting my upload speed to 100kbps and waiting for the timer to tick down. Manually clicking the retry button doesn't seem to work at all.

I have about 6 dozen zips waiting to be uploaded (would have been 7 dozen counting the failed task) so I raised the limit to 150kbps. The zips are now slowly trickling away at 60kbps per file so it's going to take a while.

I recently upgraded my PC and Internet from 50Mbps phone cable to gigabit optic fibre; I was not expecting that to cause problems.
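
A back-of-envelope estimate of how long "a while" might be, as a Python sketch. It assumes 72 zips (6 dozen), that each zip is about the size of the 99,123,610-byte file in another member's log earlier in the thread, and that the 150 limit means 150,000 bytes per second. All three are assumptions; if the limit were in kilobits, the time would be about eight times longer.

```python
def backlog_hours(files: int, bytes_per_file: int, limit_bytes_per_sec: float) -> float:
    """Hours needed to push a backlog through a fixed upload cap, ignoring retries."""
    return files * bytes_per_file / limit_bytes_per_sec / 3600


# Assumed inputs: 72 zips, ~99 MB each, cap of 150,000 bytes per second.
hours = backlog_hours(72, 99_123_610, 150_000)
print(f"{hours:.1f} hours")
```

Under those assumptions the backlog would take a little over 13 hours of continuous uploading, before counting any failed attempts or back-off delays.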
ID: 69843
Glenn Carver

Joined: 29 Oct 17
Posts: 1049
Credit: 16,432,494
RAC: 17,331
Message 69855 - Posted: 14 Oct 2023, 14:38:03 UTC - in response to Message 69843.  

I've been watching my uploads and most are going at 90-120kbps. I've not changed my max limit to 150; I suspect it's not going to make any difference if I do. I've got one waiting on its 42nd retry, but most go through in fewer than 10.

I'll find out what they have set the max no. of concurrent connections to on the Korean side for the upload server.
ID: 69855
Tomcat

Joined: 29 May 15
Posts: 17
Credit: 717,192
RAC: 12,206
Message 69857 - Posted: 14 Oct 2023, 14:54:10 UTC
Last modified: 14 Oct 2023, 14:56:04 UTC

Update:

After changing my upload speed, my failed uploads are now at 6 down from 72 (it wasn't 5 dozen, I actually counted this time). It was around 100 before I lost a bunch from the failed task. I don't know if it's me limiting the upload speed or them fixing the server.

The remaining 6 are giving me this error:
2023-10-14 10:32:11 AM | climateprediction.net | Started upload of wah2_eas25_a0nm_198912_24_996_012224398_0_r372338944_23.zip
2023-10-14 10:32:11 AM | climateprediction.net | Started upload of wah2_eas25_a0uu_199012_24_996_012224658_2_r1702623904_21.zip
2023-10-14 10:32:13 AM | climateprediction.net | [error] Error reported by file upload server: [wah2_eas25_a0nm_198912_24_996_012224398_0_r372338944_23.zip] locked by file_upload_handler PID=61512
2023-10-14 10:32:13 AM | climateprediction.net | Temporarily failed upload of wah2_eas25_a0nm_198912_24_996_012224398_0_r372338944_23.zip: transient upload error
2023-10-14 10:32:13 AM | climateprediction.net | Backing off 00:12:51 on upload of wah2_eas25_a0nm_198912_24_996_012224398_0_r372338944_23.zip
2023-10-14 10:39:31 AM | climateprediction.net | Temporarily failed upload of wah2_eas25_a0uu_199012_24_996_012224658_2_r1702623904_21.zip: transient HTTP error
2023-10-14 10:39:31 AM | climateprediction.net | Backing off 00:12:45 on upload of wah2_eas25_a0uu_199012_24_996_012224658_2_r1702623904_21.zip
2023-10-14 10:39:32 AM |  | Project communication failed: attempting access to reference site
2023-10-14 10:39:33 AM |  | Internet access OK - project servers may be temporarily down.
2023-10-14 10:50:56 AM | climateprediction.net | Started upload of wah2_eas25_a0nm_198912_24_996_012224398_0_r372338944_23.zip
2023-10-14 10:50:57 AM | climateprediction.net | [error] Error reported by file upload server: [wah2_eas25_a0nm_198912_24_996_012224398_0_r372338944_23.zip] locked by file_upload_handler PID=61512
2023-10-14 10:50:57 AM | climateprediction.net | Temporarily failed upload of wah2_eas25_a0nm_198912_24_996_012224398_0_r372338944_23.zip: transient upload error
2023-10-14 10:50:57 AM | climateprediction.net | Backing off 00:26:15 on upload of wah2_eas25_a0nm_198912_24_996_012224398_0_r372338944_23.zip


"locked by file_upload_handler PID=61512", eh? Should I suspend all uploads for a day?
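
To see at a glance which files fail and how, a short Python sketch can tally the "Temporarily failed upload" lines by file and by reason. It uses three lines copied from the log above; the line format is assumed from this excerpt only, and failure_counts is a hypothetical helper.

```python
import re
from collections import Counter

LOG = """\
2023-10-14 10:32:13 AM | climateprediction.net | Temporarily failed upload of wah2_eas25_a0nm_198912_24_996_012224398_0_r372338944_23.zip: transient upload error
2023-10-14 10:39:31 AM | climateprediction.net | Temporarily failed upload of wah2_eas25_a0uu_199012_24_996_012224658_2_r1702623904_21.zip: transient HTTP error
2023-10-14 10:50:57 AM | climateprediction.net | Temporarily failed upload of wah2_eas25_a0nm_198912_24_996_012224398_0_r372338944_23.zip: transient upload error
"""

FAILED = re.compile(r"Temporarily failed upload of (\S+): (.+)$")


def failure_counts(log_text):
    """Count failures per (file name, reason) pair."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FAILED.search(line)
        if match:
            counts[(match.group(1), match.group(2).strip())] += 1
    return counts


counts = failure_counts(LOG)
for (name, reason), n in counts.items():
    print(n, reason, name)
```

In this excerpt the file that the server reports as locked fails twice with "transient upload error", while the other file fails once with "transient HTTP error", so the two look like separate symptoms.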
ID: 69857
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4540
Credit: 19,008,987
RAC: 21,524
Message 69860 - Posted: 14 Oct 2023, 15:47:43 UTC

I've not changed my max limit to 150,
That is one and a half times the maximum I can get to start with. The only time I restrict mine is when my other half has a Zoom meeting while tasks are uploading!
ID: 69860
flensr

Joined: 17 Oct 18
Posts: 8
Credit: 1,728,357
RAC: 3,964
Message 69862 - Posted: 15 Oct 2023, 4:20:41 UTC - in response to Message 69711.  
Last modified: 15 Oct 2023, 4:22:45 UTC

After 6 of my 996 tasks aborted due to errors related to a computer reboot, I got 2 to finish. But the zips won't upload.

10/14/2023 21:21:39 | climateprediction.net | Started upload of wah2_eas25_a3o5_200812_24_996_012228305_0_r976356106_19.zip
10/14/2023 21:22:02 | climateprediction.net | Temporarily failed upload of wah2_eas25_a3o5_200812_24_996_012228305_0_r976356106_19.zip: transient HTTP error
10/14/2023 21:22:02 | climateprediction.net | Backing off 04:04:42 on upload of wah2_eas25_a3o5_200812_24_996_012228305_0_r976356106_19.zip
10/14/2023 21:22:03 | | Project communication failed: attempting access to reference site
10/14/2023 21:22:04 | | Internet access OK - project servers may be temporarily down.
ID: 69862
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4540
Credit: 19,008,987
RAC: 21,524
Message 69863 - Posted: 15 Oct 2023, 8:03:15 UTC - in response to Message 69862.  

After 6 of my 996 tasks aborted due to errors related to a computer reboot, I got 2 to finish. But the zips won't upload.
Try limiting your upload speed. It seems to have helped with some here.
ID: 69863
rob

Joined: 5 Jun 09
Posts: 97
Credit: 3,736,855
RAC: 4,073
Message 69866 - Posted: 15 Oct 2023, 14:39:58 UTC

A question (which has probably been asked a few times already):
Does it matter if a low-numbered zip file is stuck in the can't-upload cycle, but higher-numbered zips for the same task have escaped the cycle and have been safely uploaded?
ID: 69866
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4540
Credit: 19,008,987
RAC: 21,524
Message 69867 - Posted: 15 Oct 2023, 15:34:33 UTC - in response to Message 69866.  

It matters in the sense that the task will never report and show as completed on the server if any of the zips fail to upload. So long as they upload eventually, the order in which they make it through makes no difference. My first tasks of this batch should complete in the early hours of tomorrow morning (UK time). Given that mine all seem to be uploading without issue, I expect to see four completed and reported when I get up.
ID: 69867
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4540
Credit: 19,008,987
RAC: 21,524
Message 69868 - Posted: 15 Oct 2023, 15:36:24 UTC

I don't know if it's me limiting the upload speed or them fixing the server.
Nothing has changed on the server at the moment but whether it is you limiting the upload speed or random chance is still open to debate.
ID: 69868
flensr

Joined: 17 Oct 18
Posts: 8
Credit: 1,728,357
RAC: 3,964
Message 69870 - Posted: 15 Oct 2023, 18:07:06 UTC - in response to Message 69863.  

I've tried upload speeds of 30, 100 and 150 and it hasn't helped. I'm probably going to abort the transfer and limit my project participation to no more than 1 work unit at a time so I don't waste so much computer time on these in the future. I had 8 going this last week; 6 couldn't handle a reboot and the 2 that finished won't upload. It's a waste of time and energy at this point; it's just too fragile. That's sad, because I think this is one of the more worthwhile projects out there and up until now I had given CPDN the absolute highest priority on my computers.
ID: 69870
ChelseaOilman

Joined: 24 Dec 19
Posts: 32
Credit: 40,991,754
RAC: 77,248
Message 69872 - Posted: 15 Oct 2023, 18:45:42 UTC
Last modified: 15 Oct 2023, 18:46:12 UTC

Limiting max number of uploads to 1 and limiting upload speeds to 100KB has done nothing for me.

Now I see a 927KB file ending in out.zip on 1 of my computers going nowhere. Can I assume that's a trickle up file?
ID: 69872
Tomcat

Joined: 29 May 15
Posts: 17
Credit: 717,192
RAC: 12,206
Message 69873 - Posted: 15 Oct 2023, 19:25:36 UTC
Last modified: 15 Oct 2023, 19:30:52 UTC

I'm still at 6 trickle files stuck. A few new trickle and "_out" files managed to upload after my previous update.


2023-10-15 2:43:29 PM | climateprediction.net | Started upload of wah2_eas25_a4of_201512_24_996_012229611_1_r299430643_19.zip
2023-10-15 2:43:31 PM | climateprediction.net | [error] Error reported by file upload server: [wah2_eas25_a4of_201512_24_996_012229611_1_r299430643_19.zip] locked by file_upload_handler PID=172690
2023-10-15 2:43:31 PM | climateprediction.net | Temporarily failed upload of wah2_eas25_a4of_201512_24_996_012229611_1_r299430643_19.zip: transient upload error
2023-10-15 2:43:31 PM | climateprediction.net | Backing off 00:04:39 on upload of wah2_eas25_a4of_201512_24_996_012229611_1_r299430643_19.zip
2023-10-15 2:54:27 PM | climateprediction.net | Started upload of wah2_eas25_a4of_201512_24_996_012229611_1_r299430643_19.zip
2023-10-15 2:54:28 PM | climateprediction.net | [error] Error reported by file upload server: [wah2_eas25_a4of_201512_24_996_012229611_1_r299430643_19.zip] locked by file_upload_handler PID=172690
2023-10-15 2:54:28 PM | climateprediction.net | Temporarily failed upload of wah2_eas25_a4of_201512_24_996_012229611_1_r299430643_19.zip: transient upload error
2023-10-15 2:54:28 PM | climateprediction.net | Backing off 00:11:22 on upload of wah2_eas25_a4of_201512_24_996_012229611_1_r299430643_19.zip
2023-10-15 3:06:31 PM | climateprediction.net | Sending scheduler request: To send trickle-up message.
2023-10-15 3:06:31 PM | climateprediction.net | Not requesting tasks: "no new tasks" requested via Manager
2023-10-15 3:06:32 PM | climateprediction.net | Scheduler request completed
2023-10-15 3:06:32 PM | climateprediction.net | Project requested delay of 3636 seconds
2023-10-15 3:10:58 PM | climateprediction.net | Started upload of wah2_eas25_a0nm_198912_24_996_012224398_0_r372338944_19.zip
2023-10-15 3:10:58 PM | climateprediction.net | Started upload of wah2_eas25_a4of_201512_24_996_012229611_1_r299430643_19.zip
2023-10-15 3:11:00 PM | climateprediction.net | [error] Error reported by file upload server: [wah2_eas25_a4of_201512_24_996_012229611_1_r299430643_19.zip] locked by file_upload_handler PID=172690
2023-10-15 3:11:00 PM | climateprediction.net | Temporarily failed upload of wah2_eas25_a4of_201512_24_996_012229611_1_r299430643_19.zip: transient upload error
2023-10-15 3:11:00 PM | climateprediction.net | Backing off 00:29:35 on upload of wah2_eas25_a4of_201512_24_996_012229611_1_r299430643_19.zip
2023-10-15 3:11:23 PM | climateprediction.net | Temporarily failed upload of wah2_eas25_a0nm_198912_24_996_012224398_0_r372338944_19.zip: transient HTTP error
2023-10-15 3:11:23 PM | climateprediction.net | Backing off 03:01:48 on upload of wah2_eas25_a0nm_198912_24_996_012224398_0_r372338944_19.zip
2023-10-15 3:11:24 PM |  | Project communication failed: attempting access to reference site
2023-10-15 3:11:25 PM |  | Internet access OK - project servers may be temporarily down.
2023-10-15 3:17:36 PM |  | Suspending network activity - user request


"[error] Error reported by file upload server: [wah2_eas25_a4of_201512_24_996_012229611_1_r299430643_19.zip] locked by file_upload_handler PID=172690" What is this?
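
The "Backing off" lines show how long the client waits before each retry. The Python sketch below converts them to seconds, using four lines copied from the log above; the HH:MM:SS format is assumed from this excerpt, and backoff_seconds is a hypothetical helper.

```python
import re

LOG = """\
2023-10-15 2:43:31 PM | climateprediction.net | Backing off 00:04:39 on upload of wah2_eas25_a4of_201512_24_996_012229611_1_r299430643_19.zip
2023-10-15 2:54:28 PM | climateprediction.net | Backing off 00:11:22 on upload of wah2_eas25_a4of_201512_24_996_012229611_1_r299430643_19.zip
2023-10-15 3:11:00 PM | climateprediction.net | Backing off 00:29:35 on upload of wah2_eas25_a4of_201512_24_996_012229611_1_r299430643_19.zip
2023-10-15 3:11:23 PM | climateprediction.net | Backing off 03:01:48 on upload of wah2_eas25_a0nm_198912_24_996_012224398_0_r372338944_19.zip
"""

BACKOFF = re.compile(r"Backing off (\d+):(\d{2}):(\d{2}) on upload of (\S+)")


def backoff_seconds(log_text):
    """Return (file name, back-off in seconds) for each 'Backing off' line, in log order."""
    delays = []
    for match in BACKOFF.finditer(log_text):
        hours, minutes, secs = (int(match.group(i)) for i in (1, 2, 3))
        delays.append((match.group(4), hours * 3600 + minutes * 60 + secs))
    return delays


delays = backoff_seconds(LOG)
for name, secs in delays:
    print(f"{secs:>6} s  {name}")
```

For the a4of file the waits in this excerpt grow from 279 to 682 to 1775 seconds, and the a0nm file is already at 10908 seconds (just over three hours), which fits the observation that clicking retry manually does not help much.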
ID: 69873