Message boards : Number crunching : Batch 996 Weather@Home2 East Asia25
Author | Message |
---|---|
Send message Joined: 29 Oct 17 Posts: 1049 Credit: 16,432,494 RAC: 17,331 |
I did that after I read your message but didn't find anything. Heard back from Andy; he's never seen that before. @Glenn --- CPDN Visiting Scientist |
Send message Joined: 1 Jan 07 Posts: 1061 Credit: 36,703,308 RAC: 9,860 |
Thread 7592, specifically message 46161? |
Send message Joined: 29 Oct 17 Posts: 1049 Credit: 16,432,494 RAC: 17,331 |
[quote]Thread 7592, specifically message 46161?[/quote] Yes, I saw that one. But all it tells me is there's something wrong with writing files to a device. I/O should be buffered normally but there are places in the code where it tries to force a flush. But even that is only a hint to the OS, which can choose to ignore it. If it was me getting those errors, I'd check the device health as a first step. |
Send message Joined: 2 Oct 06 Posts: 54 Credit: 27,309,613 RAC: 28,128 |
[quote]FWIW, I have 7 zips that cannot upload. "transient HTTP error"[/quote] [quote]Andy's just informed me that he's restarted the httpd server on the Korean machine. It was running & not out of space, but rather a lot of uploads and most likely stale connections. Hope that's got stuck uploads moving again.[/quote] I have 9 now stuck. I just now tried to upload, with no success: 1182081 climateprediction.net 10/10/2023 9:08:03 AM Started upload of wah2_eas25_a2h5_200112_24_996_012226757_2_r1545933914_8.zip 1182214 climateprediction.net 10/10/2023 9:08:52 AM Temporarily failed upload of wah2_eas25_a2h5_200112_24_996_012226757_2_r1545933914_8.zip: transient HTTP error 1182215 climateprediction.net 10/10/2023 9:08:52 AM Backing off 05:00:54 on upload of wah2_eas25_a2h5_200112_24_996_012226757_2_r1545933914_8.zip 1182216 10/10/2023 9:08:53 AM Project communication failed: attempting access to reference site 1182217 10/10/2023 9:08:54 AM Internet access OK - project servers may be temporarily down. |
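For anyone wanting to report which files are affected, the "Temporarily failed upload" entries can be pulled out of a saved event log. This is only a minimal sketch: it assumes the log has been saved as plain text with one entry per line in the format quoted in the message above, and `failed_uploads` is a made-up helper name, not part of BOINC.

```shell
# Hypothetical helper (not a BOINC command): read event-log text on stdin and
# print each file name that appears in a "Temporarily failed upload of" entry.
# Assumes one log entry per line, in the format quoted in the post above.
failed_uploads() {
  sed -n 's/.*Temporarily failed upload of \([^:]*\): .*/\1/p' | sort -u
}
```

With the log saved to a file (the name is up to you), `failed_uploads < eventlog.txt` would list each stuck file once, which may be handy when passing details on to the project.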
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,011,472 RAC: 21,368 |
I have let the project know along with the crucial extract from an event log. See this post. |
Send message Joined: 5 Jun 09 Posts: 97 Credit: 3,736,855 RAC: 4,073 |
Thanks - I'll put that idea in the red-herring bin |
Send message Joined: 24 Dec 19 Posts: 32 Credit: 40,993,432 RAC: 76,861 |
[quote]I have 9 now stuck. I just now tried to upload, with no success:[/quote] I feel your pain. I've resigned myself to the reality this issue isn't going to get fixed. |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,011,472 RAC: 21,368 |
[quote]I have 9 now stuck. I just now tried to upload, with no success:[/quote] What I would really like to understand is why some don't seem to have any problems. Is it just random or is there a pattern that neither I nor anyone else is seeing? Edit: My message has been passed from the researcher to those maintaining the server. Unfortunately, this issue may be so esoteric that it might not help much. |
Send message Joined: 22 Dec 08 Posts: 7 Credit: 21,869,243 RAC: 28,113 |
From Germany. I counted them: $ for host in r r2 r5 pc; do echo "$host $(bnc $host --get_file_transfers | grep -c wah2)"; done Output: r 48, r2 42, r5 38, pc 3. Sometimes I see a few uploads, but mostly they are stuck while uploading. |
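The one-liner above relies on `bnc`, which looks like the poster's own wrapper. A rough equivalent using the stock `boinccmd` tool might look like the sketch below. It assumes `boinccmd` is on the PATH, that each host accepts remote GUI RPC connections (a password, if required, would need `--passwd`), and that each pending transfer contributes one listing line containing `wah2`. The function name `count_wah2_transfers` is made up for illustration.

```shell
# Hypothetical helper: for each host name given as an argument, print the host
# and the number of transfer-listing lines that mention "wah2".
# Assumes boinccmd is installed and the hosts allow remote GUI RPC access.
count_wah2_transfers() {
  for host in "$@"; do
    # grep -c prints 0 but exits non-zero when nothing matches; "|| true"
    # keeps that case from aborting scripts run with "set -e".
    count=$(boinccmd --host "$host" --get_file_transfers | grep -c wah2 || true)
    echo "$host $count"
  done
}
```

Called as `count_wah2_transfers r r2 r5 pc`, this would print one "host count" pair per line, much like the output quoted above.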
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,011,472 RAC: 21,368 |
From the researcher: "Hi Dave, an IT staff member told me there isn’t any change in bandwidth settings (port 50000-51000) for the Korean server, so it should be (physically) open for any user as usual. Since he wants to investigate further, could you provide me more information? I will hand it over to him. *Information on the uploader having an issue: e.g., its IP address, upload date, intended port number, etc." |
Send message Joined: 1 Jan 07 Posts: 1061 Credit: 36,703,308 RAC: 9,860 |
See my comments in the Windows upload thread. |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,011,472 RAC: 21,368 |
I have posted your further comments on the Trello board, Richard. My own uploads are all still getting through with only the occasional transient HTTP error from a bored band connection maxing out at about 120Kb/second with a following wind. |
Send message Joined: 2 Oct 06 Posts: 54 Credit: 27,309,613 RAC: 28,128 |
In addition to 12 zip files, I now have a completed task that cannot upload. |
Send message Joined: 5 Aug 04 Posts: 126 Credit: 24,437,617 RAC: 23,687 |
Since I apparently overlooked a Windows 10 update, 15 tasks crapped out after the unexpected re-boot. 14 errored out with "Signal 11 received: Segment violation", but one of them strangely enough also had "The system cannot find the drive specified. (0xf) - exit code 15 (0xf)". One of them had "The access code is invalid. (0xc) - exit code 12 (0xc)". All of them had at least 1 trickle, meaning these aren't WUs that errored out at the initial startup. |
Send message Joined: 9 Dec 05 Posts: 116 Credit: 12,547,934 RAC: 2,738 |
I experienced an unexpected power failure today. My two hosts were both running an EAS task when this happened. Both seem to have survived the abnormal shutdown and are now crunching ahead. The Win10 host's task also survived the Patch Tuesday restart earlier this morning. |
Send message Joined: 22 Feb 06 Posts: 491 Credit: 30,980,040 RAC: 14,224 |
Lost 9 of my 12 tasks following a "planned" reboot. 3 unexplained but the rest all sig 11 seg violation. One resend picked up this morning also failed sig 11 seg violation. |
Send message Joined: 24 Dec 19 Posts: 32 Credit: 40,993,432 RAC: 76,861 |
What are these restart zip files I'm seeing? |
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154 |
Well all three of my tasks crashed after uploading 10 trickles each. My machine got another task and it crashed after uploading a single trickle. I cannot tell what really went wrong with any of them. My machine is Computer ID 1512658, and the tasks were: 22340449 22339081 22339022 22346116 |
Send message Joined: 15 May 09 Posts: 4540 Credit: 19,011,472 RAC: 21,368 |
[quote]What are these restart zip files I'm seeing?[/quote] They are files generated by a lot of CPDN tasks that I think can be used to generate further tasks. They don't, however, always get used. More often than not they are generated at the end of a task rather than halfway through as in this and the previous batch. |
Send message Joined: 5 Jun 09 Posts: 97 Credit: 3,736,855 RAC: 4,073 |
While I was out last night another signal 11 arrived and departed. https://www.cpdn.org/result.php?resultid=22346439 At the same time three other tasks continued the long plod towards completion. Haul of failure since 5th October: SIGNAL 11 = 6 (runtime ~2 minutes); "restart" failure/signal 11 = 6 (runtime >3 minutes). (One of these https://www.cpdn.org/result.php?resultid=22337980 was not associated with a shutdown/restart cycle, but failed ~20 minutes after first start.) Only 3 tasks of the 15 received have any chance of reaching completion; I'll keep the PC (and BOINC) running until they have finished. |
©2024 cpdn.org