Thread 'Trickles not showing.'

adrianxw
Joined: 31 Aug 04
Posts: 145
Credit: 2,080,724
RAC: 753
Message 32022 - Posted: 5 Jan 2008, 21:10:46 UTC
Last modified: 5 Jan 2008, 21:12:55 UTC

I picked up this wu a couple of days ago. I can see it sending trickle_up messages, yet the result shows "No Trickles".

I saw that there had been some credit problems around Christmas, but I can also see that my team mate's trickles are appearing normally.

Anything to check?
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 32022
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 32023 - Posted: 5 Jan 2008, 21:24:57 UTC


There are some problems with the servers at certain times of the day, but provided that the trickles have left your computer and you got "succeeded" after the "scheduler request", it's just a matter of waiting a few hours.

The credit problem was just the export file not getting created on time, and missing the stats site update time for that day.

ID: 32023
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 32025 - Posted: 6 Jan 2008, 1:01:45 UTC - in response to Message 32022.  
Last modified: 6 Jan 2008, 1:07:42 UTC

...
Anything to check?


Usually it's just a case of waiting, as Les says. The other thing you need to keep in mind is that there are a couple of different types of trickle - one merely means 'I've resumed running', and doesn't count towards credit or get displayed.

What percentage through is your model? Slab models have 72 trickles in total, so it must be at least 1.38% of the way through before the first 'real' trickle is sent.
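The 1.38% figure is just one trickle's share of the run. A minimal sketch of that arithmetic, assuming the 72 trickles are spaced evenly through the model (the function names and the 1.0% example value are only for illustration):

```python
import math

TOTAL_TRICKLES = 72  # slab-model figure quoted in the post above


def percent_per_trickle(total_trickles: int = TOTAL_TRICKLES) -> float:
    """Percentage of the run covered by one trickle, assuming even spacing."""
    return 100.0 / total_trickles


def trickles_expected(percent_complete: float,
                      total_trickles: int = TOTAL_TRICKLES) -> int:
    """Credit-bearing trickles that should have been sent by this point."""
    return math.floor(percent_complete / percent_per_trickle(total_trickles))


print(round(percent_per_trickle(), 2))  # 1.39 (quoted above, truncated, as 1.38)
print(trickles_expected(1.0))           # 0: too early for the first 'real' trickle
```

On this reckoning a slab model showing anything below roughly 1.39% would be expected to show no trickles on the site yet.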

The trickle updates are lagging behind by at least two hours (I had one sent at 22:42, and it's registered on the site at 00:29).

I'm a volunteer and my views are my own.
News and Announcements and FAQ
ID: 32025
adrianxw
Joined: 31 Aug 04
Posts: 145
Credit: 2,080,724
RAC: 753
Message 32030 - Posted: 6 Jan 2008, 10:16:58 UTC
Last modified: 6 Jan 2008, 10:23:08 UTC

02-Jan-2008 21:17:27 [climateprediction.net] Sending scheduler request: To send trickle-up message.
02-Jan-2008 21:17:32 [climateprediction.net] Scheduler request succeeded:
03-Jan-2008 01:03:06 [climateprediction.net] Sending scheduler request: To send trickle-up message.
03-Jan-2008 01:03:15 [climateprediction.net] Scheduler request succeeded:
03-Jan-2008 17:21:23 [climateprediction.net] Sending scheduler request: To send trickle-up message.
03-Jan-2008 17:21:28 [climateprediction.net] Scheduler request succeeded:
03-Jan-2008 18:14:24 [climateprediction.net] Sending scheduler request: To send trickle-up message.
03-Jan-2008 18:14:27 [climateprediction.net] Scheduler request succeeded:
04-Jan-2008 21:22:37 [climateprediction.net] Sending scheduler request: To send trickle-up message.
04-Jan-2008 21:22:42 [climateprediction.net] Scheduler request succeeded:
06-Jan-2008 10:48:28 [climateprediction.net] Sending scheduler request: To send trickle-up message.
06-Jan-2008 10:48:34 [climateprediction.net] Scheduler request succeeded:

I have trimmed the "got 0 new tasks" and "requesting..." parts of the messages to prevent phattening.

The models have changed since I last ran CPDN. It used to run with a 30% quota on a 3.2GHz Prescott HT. In that rig, it got pretty much 1 trickle per day. It is now running at 25% on a 2.4GHz Q6600, so it basically has exclusive use of 1 core. I would have expected significantly more to be done per 24 hour period in this rig.

These "I've resumed" trickles must be something new; I don't recall them when I was running CPDN before. The "real" trickles must be larger than the old ones. The model is at 1.022%, so that is probably why there are no trickles.

I can see other people getting trickles reasonably regularly however, and getting 311 credit for them. See link in previous post.


Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 32030
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 32031 - Posted: 6 Jan 2008, 10:55:55 UTC
Last modified: 6 Jan 2008, 11:16:30 UTC


There are three different types of model; only the HadCM3 (coupled model) gets 311 credits per trickle. Yours is a HadSM3 (slab) model. Note that since you have a quad-core, and only 1GB of RAM, you need to keep an eye on memory usage - the HadAM3 (SAP) models would not be appropriate for your PC.

There are a couple of very interesting posts describing the models. The links can be found in the 'README - running the model' (link via my signature).

All three types send the 'I've resumed' trickles, but they only get sent in particular circumstances.

If you look, the trickle messages are being sent erratically rather than at a regular interval: for example, one at 17:21 and the next at 18:14. I'd expect them to be sent at a regular interval, every 4 hours or so (guessing from my overclocked Q6600 Linux system).

* Is there anything immediately prior to the 'to send trickle-up message' in the log? Perhaps a suspend/resume line, or an 'exited with zero stats' line?

* Are there any 'waiting for memory' lines? How much memory do your other projects typically take per workunit?

* Do you have 'keep in memory' turned on or off (website/your account/view general preferences)?

* How much CPU time is showing in the Boinc manager for that model?

* Does the %complete go up steadily, or is it jumping about? (i.e., losing work?)
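The irregular spacing can be checked directly from the log lines posted earlier in the thread. A rough sketch, assuming the timestamp always occupies the first 20 characters of each line and that month abbreviations parse under an English locale (the helper name is hypothetical):

```python
from datetime import datetime

# The "Sending scheduler request" lines copied from the log posted above.
LOG = """\
02-Jan-2008 21:17:27 [climateprediction.net] Sending scheduler request: To send trickle-up message.
03-Jan-2008 01:03:06 [climateprediction.net] Sending scheduler request: To send trickle-up message.
03-Jan-2008 17:21:23 [climateprediction.net] Sending scheduler request: To send trickle-up message.
03-Jan-2008 18:14:24 [climateprediction.net] Sending scheduler request: To send trickle-up message.
04-Jan-2008 21:22:37 [climateprediction.net] Sending scheduler request: To send trickle-up message.
06-Jan-2008 10:48:28 [climateprediction.net] Sending scheduler request: To send trickle-up message.
"""


def trickle_gaps_hours(log_text: str) -> list[float]:
    """Hours between consecutive trickle-up scheduler requests in the log."""
    times = [
        datetime.strptime(line[:20], "%d-%b-%Y %H:%M:%S")
        for line in log_text.splitlines()
        if "To send trickle-up message" in line
    ]
    return [(b - a).total_seconds() / 3600 for a, b in zip(times, times[1:])]


for gap in trickle_gaps_hours(LOG):
    print(f"{gap:.2f} h")
```

For this log the gaps range from under an hour to well over a day, rather than clustering around the 4 hours or so guessed at above.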

I'm a volunteer and my views are my own.
News and Announcements and FAQ
ID: 32031
adrianxw
Joined: 31 Aug 04
Posts: 145
Credit: 2,080,724
RAC: 753
Message 32034 - Posted: 6 Jan 2008, 12:30:06 UTC
Last modified: 6 Jan 2008, 12:32:13 UTC

The 1GB is an unfortunate temporary situation. When I ordered the components for this system, I had ordered 2GB of Corsair 4.4.4.12 RAM. Upon building the system and testing it, Memtest-86 v3.3 showed a fault in the RAM, which moved upon swapping the modules around. It has been sent back. The result is that the machine running CPDN is running with a stick stolen from this machine, (another Q6600). At this time, the system is not overclocked, although as it is a G0 chip and has a huge Zalman on it, on a top-notch MoBo, it probably will be when the right memory arrives.

There does not appear to be any commonality between the messages preceding the trickle-up. There are 10 projects on the machine, and most of the messages refer to regular events from these. Suspending, Restarting, Uploading - you're familiar with this stuff. Just searched stdoutae.txt: no memory issues, just the confirmation of memory usage when BOINC starts, which reflects the defaults for the location of the machine. No apparent errors. % done seems to be gradually increasing, not jumping about.

It is showing 25% CPU use, as expected. Similar resource figures to POEM, about half of what Rosetta is using. Remember, the model is fairly new; ISTR that the demands go up as the model progresses. Leave in memory is set as usual, no graphics running.

If, as you say, it will not generate a "real" trickle until it gets to 1.38%, then that is probably why, since it has not. I am surprised to see it performing as poorly as that, however, when other members of my team with regular hardware are getting scoring trickles every day or so. If the machine trickled now and got 311 for it, that would be a credit/hour of less than 5, hugely below what it is achieving anywhere else.


Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 32034
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 32035 - Posted: 6 Jan 2008, 12:50:15 UTC
Last modified: 6 Jan 2008, 12:58:39 UTC


A slab model should typically be trickling around six times per day given your processor (when not overclocked), which works out at around 23 c/h. Mine complete in about ten days with seven or eight trickles per day on the overclocked box.


I'm a volunteer and my views are my own.
News and Announcements and FAQ
ID: 32035
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 32036 - Posted: 6 Jan 2008, 13:18:24 UTC


What is the s/TS as displayed in the graphics?
That will give us a figure to compare with others.

You're running a 'slab' model, whereas your team mate (as per your first post) is running a Coupled Ocean model. The 2 types have different trickle intervals, in both real and model time, so they can't be compared.

ID: 32036
adrianxw
Joined: 31 Aug 04
Posts: 145
Credit: 2,080,724
RAC: 753
Message 32039 - Posted: 6 Jan 2008, 13:33:48 UTC

It says in the graphics that it is on step 8069 of 259248, 1.04% crunched, and 71:19:41 of CPU time to get there.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 32039
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 32040 - Posted: 6 Jan 2008, 14:54:41 UTC


That works out to be 6,860 hours for 100%... something is definitely wrong there.
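The 6,860-hour figure is a straight-line extrapolation from the numbers read off the graphics window (71:19:41 of CPU time at 1.04%, step 8069). A small sketch, assuming progress is linear in CPU time; the same inputs also give the seconds per timestep asked about earlier. The helper names are only illustrative:

```python
def hms_to_hours(hms: str) -> float:
    """Convert an 'H:MM:SS' CPU-time string to decimal hours."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h + m / 60 + s / 3600


def projected_total_hours(cpu_hms: str, percent_done: float) -> float:
    """CPU hours needed for 100%, assuming progress is linear in CPU time."""
    return hms_to_hours(cpu_hms) / (percent_done / 100)


def seconds_per_timestep(cpu_hms: str, steps_done: int) -> float:
    """Average CPU seconds spent per model timestep so far."""
    return hms_to_hours(cpu_hms) * 3600 / steps_done


print(round(projected_total_hours("71:19:41", 1.04)))    # 6858, i.e. about 6,860 hours
print(round(seconds_per_timestep("71:19:41", 8069), 1))  # 31.8
```

At roughly 286 days of CPU time for one model, that is far from the ten days or so mentioned above for a healthy slab run.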

I'm a volunteer and my views are my own.
News and Announcements and FAQ
ID: 32040
adrianxw
Joined: 31 Aug 04
Posts: 145
Credit: 2,080,724
RAC: 753
Message 32042 - Posted: 6 Jan 2008, 15:26:36 UTC

I have just run several passes of the diagnostics I use, and the h/w seems to check out. What is odd is that I can't access the Windows EventLog. I am re-installing Windows at the moment.

When I first built it, loaded Windows and BOINC, I left it running for 3 days crunching just SIMAP. It was RAC'ing over 2000 a day which is about right.

<fx>Shrugs</fx> Beats me what's up. It is ripping through Rosetta, SIMAP, MCDN and POEM. There are a couple of other projects on there with very small quotas, but even they are churning work out.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 32042
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 32043 - Posted: 6 Jan 2008, 15:43:21 UTC


You could try downloading a coupled model instead (tick 'HadCM3' on the 'your account/view CPDN preferences/edit' page).

I'm a volunteer and my views are my own.
News and Announcements and FAQ
ID: 32043
adrianxw
Joined: 31 Aug 04
Posts: 145
Credit: 2,080,724
RAC: 753
Message 32044 - Posted: 6 Jan 2008, 16:34:34 UTC

I've suspended CPDN right now. I'll play around with it a bit more as time permits. Some of the numbers I needed have come out of this thread anyway, so I know what I should be seeing.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 32044
adrianxw
Joined: 31 Aug 04
Posts: 145
Credit: 2,080,724
RAC: 753
Message 32186 - Posted: 15 Jan 2008, 20:55:39 UTC
Last modified: 15 Jan 2008, 20:56:08 UTC

Following up a bit, I found an issue with the disk drive (more specifically with the disk drive cable), which was generating a few errors in the event log. This is now fixed, so I switched the model on again to see if that was the cause.

When I did, it dropped from 1.101% at 78:38:20 back to 1.092% at 77:38:05 (the machine had been down while I was replacing the cable). I let it run for an hour and it went up to 1.101% at 78:41:02, so basically it had crunched 0.009% in just over an hour.

If the trickles come at 1.38% intervals, that gives ~153 hours per trickle, or roughly 2.2 credits per CPU hour, which is a long way from the 23 c/hr quoted earlier.

The assumption there is that the progress per unit time is a straight line - is that a fair assumption, (or within reason anyway)?
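For reference, the straight-line estimate can be written out as below. This is only a sketch under the same linearity assumption; it stops at hours per trickle because the credit awarded per slab trickle is not stated in the thread, and the function names are made up for the example:

```python
def hms_to_hours(hms: str) -> float:
    """Convert an 'H:MM:SS' CPU-time string to decimal hours."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h + m / 60 + s / 3600


def hours_per_trickle(percent_gained: float, hours_elapsed: float,
                      percent_per_trickle: float = 1.38) -> float:
    """CPU hours between trickles if progress continues at the observed rate."""
    rate = percent_gained / hours_elapsed  # percent of the model per CPU hour
    return percent_per_trickle / rate


# Rounding the test run to exactly one hour, as in the estimate above.
rough = hours_per_trickle(0.009, 1.0)

# Using the CPU times actually quoted: 77:38:05 -> 78:41:02.
elapsed = hms_to_hours("78:41:02") - hms_to_hours("77:38:05")
measured = hours_per_trickle(1.101 - 1.092, elapsed)

print(f"{rough:.0f} h per trickle (one-hour approximation)")
print(f"{measured:.0f} h per trickle (quoted CPU times)")
```

Either way the result is around 150 to 160 hours per trickle, compared with the 4 hours or so suggested earlier for this processor.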

If so, I do not understand the stupefying under-performance of the wu on this box. Projects like POEM, Einstein, Rosetta, MalariaControl, SIMAP etc. are producing much the same RAC as my other Q6600.

It is not lack of resources. The task is using less physical and virtual memory than several of the other projects. The 2 machines have different MoBos, but one is a top notch Asus, the other an equally good MSI. RAM is the same, disks are the same, OS is the same, graphics cards are the same, heatsinks are the same, clocking is the same. Cabinets are different colours ;)

Weird stuff going on in North Sjælland.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 32186
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 32187 - Posted: 15 Jan 2008, 21:19:19 UTC


I'd try a different model instead; it might be that your WU has become corrupted as a result of the disk problems. Your PC should be good for both HadCM3 and HadSM3 (coupled and slab respectively), although I wouldn't recommend HadAM3 until your memory stick is fixed since it uses a lot of RAM. Once you're back to 2GB then HadAM3 is a good choice.




I'm a volunteer and my views are my own.
News and Announcements and FAQ
ID: 32187
adrianxw
Joined: 31 Aug 04
Posts: 145
Credit: 2,080,724
RAC: 753
Message 32190 - Posted: 15 Jan 2008, 22:35:33 UTC

Okay, I'll abort this one, which I hate to do, and get another to see if it performs better.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 32190
adrianxw
Joined: 31 Aug 04
Posts: 145
Credit: 2,080,724
RAC: 753
Message 32191 - Posted: 16 Jan 2008, 8:35:32 UTC
Last modified: 16 Jan 2008, 8:38:50 UTC

Grabbed a new model, this one, an "SM", and it appears to be running as expected. It has sent a scoring trickle, which upon back calculation gives a c/hr of 23 as expected.

I will allow the model to complete and then force it to run a "CM" to see if the problem was specific to the original configuration of the machine (and/or any resulting fallout), or generic.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 32191
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 32194 - Posted: 16 Jan 2008, 12:36:56 UTC


Glad things are working now! It'll be interesting to know how the HadSM3 behaves.

I'm a volunteer and my views are my own.
News and Announcements and FAQ
ID: 32194
MW
Joined: 11 Dec 06
Posts: 46
Credit: 5,034,990
RAC: 0
Message 32224 - Posted: 19 Jan 2008, 9:38:34 UTC
Last modified: 19 Jan 2008, 9:57:53 UTC

My Slab Model is around 9.9% and no trickles registered. Any ideas? After a little digging, am I right in thinking this model has been worked on by at least ten different computers?
ID: 32224
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 32225 - Posted: 19 Jan 2008, 10:19:36 UTC


Hi MW,

The most likely cause is that there is a firewall, AV, or proxy server which is blocking trickles from the model being uploaded to the servers.

Is there anything in the 'transfers' tab? (probably not, since you're not yet at 33%).

Is there anything in the 'messages' tab shown when it tries to do a trickle-up message?

Which firewall do you use? Which antivirus do you use? Which proxy server do you use? (if you're on a work or university computer or other large network, you'll probably be using a proxy server)


I'm a volunteer and my views are my own.
News and Announcements and FAQ
ID: 32225