Thread 'TRICKLE CANNOT UPLOAD, BUT, SERVER SHOWS IT ALREADY HAS'

JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 41512 - Posted: 20 Jan 2011, 23:07:00 UTC
Last modified: 20 Jan 2011, 23:17:34 UTC

I have a rather strange problem. WU hadam3P_eu_xuv6_1977_1_007045210_2 (Task ID#12462018) has had a trickle stuck in my transfer tab for 2 days now. The problem is that I cannot get this trickle to upload. I think it is the first trickle. I know that other trickles for this WU have uploaded since it was created 2 days ago.

At first I put this down to all of the server problems we have been having, but now I don’t think so. The WU is presently crunching its way through late March of 1978. It should have produced 3 trickles by now. Checking the WU under “computers” in “my account”, I find that 3 trickles are listed as having been received. Does the server really have the entire trickle? If so, how do I get rid of the pseudo trickle stuck in the transfer tab that keeps trying (and failing) to upload? Should I use the "abort transfer" button to get rid of it? Is this a sign that the WU is flawed and should be aborted, or can it be saved?

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 41513 - Posted: 21 Jan 2011, 0:14:16 UTC - in response to Message 41512.  

It may simply be due to the change of IP of all of our servers that are on the OeRC network, as per the News thread.
Until someone shows up who can fix the problem, a lot of trickles and zips for various file types will be stuck in people's Transfer queues.

As for your trickle, is it essential that you remove it from your queue?

scottishwebcamslive.com
Joined: 21 Jun 06
Posts: 26
Credit: 8,397,236
RAC: 0
Message 41514 - Posted: 21 Jan 2011, 14:24:55 UTC - in response to Message 41513.  

Hello,

I have the same problem, only not with 1 trickle but with 73 on one machine alone!
If there is an IP problem, somebody needs to let us know what to do about it, as me sitting here filling up my hard drive is no solution.

best regards
Ian

JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 41515 - Posted: 21 Jan 2011, 14:45:29 UTC - in response to Message 41513.  

Thanks for the reply, Les. So you think that the failure of the zip files to upload (there are now 2, 1.zip and 4.zip, stuck in the transfer tab) is server-related and not an inherent problem in the WU. No real need to clear the transfer tab immediately. I will just suspend network activity to keep it from trying to upload every few hours. I will keep running it and hope that the gods of IT get the server problem sorted out before the WU finishes in about a week.

The one thing that I don’t get is why does the “Trickle Information for Result # 12462018” page show the zip files as having been uploaded? Does the server have them or not???


3rkko
Joined: 12 Feb 08
Posts: 66
Credit: 4,877,652
RAC: 0
Message 41516 - Posted: 21 Jan 2011, 16:22:24 UTC - in response to Message 41515.  

Trickles and zip files are two different things. A trickle is just a small piece of data that lets the server know the model is still alive. A zip file is several megabytes of scientific data the model has generated. They usually go to different servers so you may be able to trickle but not upload zips.

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 41519 - Posted: 21 Jan 2011, 19:11:51 UTC - in response to Message 41515.  

Jim

If the IP change occurred after your computer uploaded the trickle data, but before the server could return an Ack message, then this would produce the effect that you see.
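To make that sequence concrete, here is a toy sketch in Python. It is not BOINC's actual upload protocol; the `Server` and `Client` classes and the `ack_lost` flag are made up purely for illustration. It only shows how the server can hold the data while the client, never having seen the Ack, keeps the file queued.

```python
class Server:
    """Toy upload server that remembers which files it has stored."""

    def __init__(self):
        self.received = set()

    def upload(self, name, ack_lost=False):
        # The data is stored before any reply is sent.
        self.received.add(name)
        if ack_lost:
            # Simulates the reply never reaching the client
            # (for example, the server's IP changed mid-exchange).
            return None
        return "ACK"


class Client:
    """Toy client that keeps a file queued until it sees an ACK."""

    def __init__(self, files):
        self.pending = list(files)

    def try_upload(self, server, ack_lost=False):
        for name in list(self.pending):
            if server.upload(name, ack_lost=ack_lost) == "ACK":
                self.pending.remove(name)


server = Server()
client = Client(["trickle_1"])

client.try_upload(server, ack_lost=True)
print(server.received)  # {'trickle_1'}  -> server already has it
print(client.pending)   # ['trickle_1']  -> client still thinks it is stuck
```

In this sketch a later retry that does get its reply clears the queue, and nothing is stored twice because the server keeps names in a set.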


scottishwebcamslive.com
Joined: 21 Jun 06
Posts: 26
Credit: 8,397,236
RAC: 0
Message 41520 - Posted: 21 Jan 2011, 19:19:51 UTC - in response to Message 41516.  
Last modified: 21 Jan 2011, 19:20:46 UTC

hello,

I realize that trickles and zip files are different, the zips being 5.24 meg in size, for instance.
I now have 76 zip files, equaling nearly 400 meg, stored up so far and increasing steadily.
Something has to be done to allow these zip files to be uploaded, as I don't plan on having an upload on my broadband connection of a gig or something ridiculous in one go!
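As a rough check on those figures, the arithmetic can be sketched in Python. This assumes every zip is the 5.24 meg quoted above, which may not hold for every file; `backlog_mb` is just an illustrative helper name.

```python
ZIP_SIZE_MB = 5.24  # per-zip size quoted in the post (assumed the same for every zip)


def backlog_mb(zip_count, zip_size_mb=ZIP_SIZE_MB):
    """Estimate the total size, in MB, of the zips waiting to upload."""
    return zip_count * zip_size_mb


print(round(backlog_mb(76), 2))  # 398.24
```

So 76 zips at that size come to roughly 398 meg, which matches "nearly 400 meg".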

best regards
Ian

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 41521 - Posted: 21 Jan 2011, 19:26:40 UTC - in response to Message 41514.  

Ian

There are no programmers working for cpdn at present, and apparently no one else around at Oxford Uni who knows how to fix the problem.
The people who are running the current projects know about it. One was in the process of moving data off Kraken to another server when the change occurred.
This is why Kraken is still showing as off line.

As for what users can do, the only solution is to go into the Projects tab, set cpdn to No new tasks, then go into the Tasks tab, and Suspend all cpdn models.
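For anyone who prefers the command line to the Manager tabs, something similar can be done with `boinccmd`, the command-line tool that ships with the BOINC client. The Python sketch below only builds the command lines and prints them. Two things in it are assumptions: the project URL (use the one shown in your own Projects tab), and suspending the whole project as a stand-in for suspending each model one by one in the Tasks tab.

```python
# Assumed project URL; check the Projects tab for the exact one on your machine.
PROJECT_URL = "http://climateprediction.net/"


def pause_project_commands(project_url=PROJECT_URL, boinccmd="boinccmd"):
    """Commands to stop fetching new work and then suspend the project."""
    return [
        [boinccmd, "--project", project_url, "nomorework"],
        [boinccmd, "--project", project_url, "suspend"],
    ]


def resume_project_commands(project_url=PROJECT_URL, boinccmd="boinccmd"):
    """Commands to undo the above once the servers are back."""
    return [
        [boinccmd, "--project", project_url, "resume"],
        [boinccmd, "--project", project_url, "allowmorework"],
    ]


for cmd in pause_project_commands():
    print(" ".join(cmd))
```

Each printed line can be pasted into a terminal, or the lists can be passed to `subprocess.run` if you want the script to run them itself.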

The server for the beta test site is also affected.
This means that testers can't return data, meaning in turn, that the release of new models is going to be delayed even further.


Milo Thurston
Volunteer moderator
Volunteer developer
Joined: 2 Mar 06
Posts: 253
Credit: 363,646
RAC: 0
Message 41523 - Posted: 22 Jan 2011, 12:58:50 UTC - in response to Message 41521.  


"...and apparently no one else around at Oxford Uni who knows how to fix the problem."


I suspect that they know how to fix it, but because the IP addresses were switched earlier than I expected, various machines are now not connected to the network, and the only way to fix them is to go into the OeRC machine room. No current CPDN staff have access to this room, though the new staff will.

Meanwhile, I'll have to do it next week.

Nigel Garvey
Joined: 5 May 10
Posts: 69
Credit: 1,169,103
RAC: 2,258
Message 41524 - Posted: 23 Jan 2011, 11:26:57 UTC - in response to Message 41520.  

"I now have 76 zip files equaling nearly 400 meg of zip files stored up so far and increasing steadily.
Something has to be done to allow these zip files to be uploaded, as I don't plan on having an upload on my broadband connection of a gig or something ridiculous in one go!"


Just think of the hit the server's going to take when it's reconnected! :)

As I write, I have twenty .zips from two current FAMOUS tasks waiting for upload. (A trifle under 105 MB.) If the tasks error before the files are uploaded, the files will simply be deleted (by the BOINC client, I believe). That happened when the server was down over Christmas too. Presumably the data were lost.

NG

Helmer Bryd
Joined: 16 Aug 04
Posts: 156
Credit: 9,035,872
RAC: 2,928
Message 41525 - Posted: 23 Jan 2011, 11:28:06 UTC
Last modified: 23 Jan 2011, 12:02:18 UTC

You are an angel Milo :-)

scottishwebcamslive.com
Joined: 21 Jun 06
Posts: 26
Credit: 8,397,236
RAC: 0
Message 41527 - Posted: 23 Jan 2011, 13:53:48 UTC

hi,

I now have 206 zip files between my 2 machines, equalling more than a gig of files to be uploaded, so I suspect the servers, even when back, will be under a very severe load, if not too high a load, when accepting all this data.

Ian

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 41529 - Posted: 23 Jan 2011, 19:50:45 UTC - in response to Message 41524.  

Nigel

There's a very simple solution to your worry: BACKUPS!

Make one now while all of the models are still running, and use this to get all of the zips back to Oxford if they crash near the end.
Make a new one every day. Having lots of backups isn't a crime!

It's also possible to do as I mentioned earlier, and Suspend all of the models until the servers are back.

The first step of course, is to go to the Projects tab, and set ALL projects to No new tasks, run down ALL work other than cpdn, and THEN make the backups.
Then get more work from your other projects if desired.
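A dated copy of the BOINC data directory is one simple way to do this. The Python sketch below is only an illustration: the function name and the example paths are made up, and it assumes BOINC has been shut down first so that no files change while they are being copied.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_boinc_data(data_dir, backup_root, now=None):
    """Copy the whole data directory into a new timestamped folder.

    Assumes the BOINC client is not running while the copy is made.
    Returns the path of the backup that was created.
    """
    data_dir = Path(data_dir)
    if not data_dir.is_dir():
        raise FileNotFoundError(f"No such data directory: {data_dir}")
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"boinc-backup-{stamp}"
    # copytree creates the target (and any missing parents) and fails
    # if the target already exists, so an old backup is never overwritten.
    shutil.copytree(data_dir, target)
    return target


# Example call with hypothetical paths; adjust both to your own system:
# backup_boinc_data("/var/lib/boinc-client", "/mnt/backups")
```

Because each backup gets its own timestamped folder, making a new one every day just adds another folder next to the old ones.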


Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 41530 - Posted: 23 Jan 2011, 21:04:49 UTC

As I don't run my computer 24/7 and only have 2 cores I don't have to worry about running out of space unless the zip files go for months without being able to upload.
Thanks again Milo for keeping things working in your own/your new job's time when you could have abandoned the project.

Dave

Milo Thurston
Volunteer moderator
Volunteer developer
Joined: 2 Mar 06
Posts: 253
Credit: 363,646
RAC: 0
Message 41531 - Posted: 24 Jan 2011, 9:09:24 UTC - in response to Message 41530.  


"Thanks again Milo for keeping things working in your own/your new job's time when you could have abandoned the project."


You're welcome.
Kraken is now fixed but, as expected, heavily overloaded. So, don't be surprised if it appears to be down when you try to connect. Some people are getting through as files are arriving.

Nigel Garvey
Joined: 5 May 10
Posts: 69
Credit: 1,169,103
RAC: 2,258
Message 41532 - Posted: 24 Jan 2011, 9:53:47 UTC - in response to Message 41529.  
Last modified: 24 Jan 2011, 9:54:21 UTC

Thanks, Les.

I suspended Climateprediction.net last night rather than do anything which might interfere with my other projects. The files are uploading now, in no particular order, as I write. Only three and two halves to go now. I'll unsuspend when they're all gone.

JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 41542 - Posted: 26 Jan 2011, 19:27:53 UTC

I have a question about the Hadam3p_pnw WUs and the present server problems. I know that they upload 12 monthly zip files and that they go to servers at the University of Oregon. They have uploaded just fine. It is the 13th zip file that I am worried about. I know that it goes to Oxford, but I am not sure which server it goes to. Does it go to the OeRC server that is presently offline? Should I suspend before it finishes? I don’t need any more zip files stuck in the transfer tab.

mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 41544 - Posted: 27 Jan 2011, 0:49:22 UTC

I haven't got any PNW models myself so I can't check to make sure, but on the Server status page it says:

Upload server (restart dumps) climateapps1.oucs

File _13 for all the regional models is a restart dump. The _13 file for EU models goes to climateapps1.oucs. The _13 file for SA and PNW models must either go to the same server as files 1 - 12 or, much more probably, to climateapps1. I don't think the people in Oregon or Penn State Uni will want restart dumps to put together all the models in the time series, so climateapps1 is more likely.

I don't think they'd want to use uploader.oerc (the server that's still down) for any of the restart dumps because it's being used as an ordinary upload server. So I'd let the _13 PNW file upload.

JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 41545 - Posted: 27 Jan 2011, 14:42:09 UTC - in response to Message 41544.  

Thanks, Mo. I will let it run to the end and hope that it can upload.



KWSN THE Holy Hand Grenade!
Joined: 9 Apr 07
Posts: 7
Credit: 1,630,807
RAC: 0
Message 41546 - Posted: 27 Jan 2011, 17:40:43 UTC
Last modified: 27 Jan 2011, 17:42:21 UTC

I'm having the same problem, but with a kicker... some of my .zip files are getting through, but others are not!

For example, on WU Hadam3p_eu_xml2_1997_1_007010478_1, zips 1, 4, 7 & 10 are stuck, but 12 and 13 (that I've noticed...) have uploaded... (13 is uploading as I type...)