DMFA forum redirected.

Started by Amber Williams, September 12, 2006, 12:43:58 AM

Amber Williams

Just a heads up, the DMFA pages now point to this forum instead of the Nice.  Which means Phase 2 is underway.  Anyone still wanting to get stuff off the old forum best do it now or very soon.  Also, this will likely be the last week or so where DMFA people can get their old postcounts back.

In about 2 weeks I will likely request the old forum on the Nice be removed.

Aridas

I plan to be crazy-obsessive and get the forum archived, hopefully before then. That way, anyone who misses something they want will have a mirror site... technically.

Tapewolf

Quote from: Aridas Soulfire on September 12, 2006, 01:15:36 AM
I plan to be crazy-obsessive and get the forum archived, hopefully before then. That way, anyone who misses something they want will have a mirror site... technically.

I would appreciate that.  I still find myself looking things up on The Nice fairly regularly.

J.P. Morris, Chief Engineer DMFA Radio Project * IT-HE * D-T-E


Zedd

Well now, Ms Amber... I do hope it won't be too much of a hurry doing that.

Amber Williams

Considering how long it took for me to change the links on the main page to the new forum...I wouldn't expect the old forum to vanish overnight.

Azlan

*poof* Old forum now gone...

~forum faerie
"Ha ha! The fun has been doubled!"

Zedd


llearch n'n'daCorna

Quote from: Amber Panyko on September 12, 2006, 03:13:14 PM
Considering how long it took for me to change the links on the main page to the new forum...I wouldn't expect the old forum to vanish overnight.

*grin* but we like you like this, Amber. If you got all efficient and stuff, we wouldn't recognise you anymore :-)
Thanks for all the images | Unofficial DMFA IRC server
"We found Scientology!" -- The Bad Idea Bears

Aridas

Ok, that didn't work too well. My program hit its limit and won't do any more. Is anyone experienced with, or at least potentially capable of, saving offline site mirrors? I can give you the page to start at to get the entire DMFA forum as opposed to the past 45 days, but you'll have to figure out the rest yourself and... well... find something without a page limit. I can't find anything that's easy enough for me to use.

Either that, or we beg Amber to let the forum stay up for all eternity.

llearch n'n'daCorna

wget -q -nc -w 1 --random-wait --limit-rate=50k -nH -E -m -p -np "$URL"

?

Aridas

Someone else can go get it, I've put a bit too much effort into it as it is. >.>
Besides, aren't we going to need some sort of limity thing to keep it from leaking into things other than the DMFA section and the images displayed from other sites? I had something doing that with enough ease of use that even I could figure it out, but I need $2000 to get a version with a high enough limit to get the whole thing. >_>

But anyway, for those of you wondering though, you'll be able to archive it starting from this url:

http://nice.purrsia.com/cgi-bin/ultimatebb.cgi?ubb=forum&f=79&DaysPrune=1000&submit=Go

llearch n'n'daCorna

in process, then.

Currently on 761 pages, 66MB downloaded....

Tapewolf

Quote from: llearch n'n'daCorna on September 14, 2006, 05:04:51 AM
in process, then.

Currently on 761 pages, 66MB downloaded....

It never occurred to me that you could use WGET on a forum.  I'll have to give that a go..

Gabi

Wow, llearch, which server are you hosting that on?
~~ Gabi a.k.a. Gliynn Starseed, APF ~~
Thanks to Silver for the yappities, and to everyone for being so great!
(12:28:12) llearch: Gabi is equal-opportunity friendly

llearch n'n'daCorna

Uh. None, yet. It'll be on my personal server, once I get the machines swapped over, maybe, if I feel that way inclined.

Don't hold your breath. :-)

It's up to 8500 files, 213MB.


I think it needs some help, though, as it's getting confused...

Nuts. wget won't do it, as it won't filter on the query string. Which means I'll have to come up with something more complex... :-/
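
One possible way around the filtering problem is to build the URL list separately, filter it on the query string, and hand the result to wget -i. The sketch below assumes a file of candidate URLs, one per line (all-urls.txt and dmfa-urls.txt are made-up names), and assumes the DMFA board is always addressed as f=79, as in the start URL posted earlier in the thread. Newer wget releases also offer --accept-regex and --reject-regex, which match against the complete URL, so those may be simpler where available.

```shell
#!/bin/sh
# Read URLs on stdin and print only those whose query string selects forum 79.
# The f=79 parameter may follow "?", ";" or "&" and must not run on into
# further digits (so f=790 is rejected).
keep_dmfa() {
    grep -E '[?;&]f=79([;&]|$)'
}

# Example (hits the network; file names are placeholders):
#   keep_dmfa < all-urls.txt > dmfa-urls.txt
#   wget -nc -x -w 1 --random-wait --limit-rate=50k -i dmfa-urls.txt
```

Filtering before the download keeps the crawl from wandering into other boards, at the cost of needing some other way to collect the candidate URLs first.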

Aridas

Teleport Pro was doing an excellent job of saving my pages, til it decided to give up.

Sid

Quote from: llearch n'n'daCorna on September 14, 2006, 11:04:22 AM
Nuts. wget won't do it, as it won't filter on the query string. Which means I'll have to come up with something more complex... :-/

I once (March 31) used cURL on the Print View pages in batches of 50 (to only produce short bandwidth spikes as opposed to a freaking huge OMGWTF block). It worked well from what I can see, and I'll do it again on the weekend. Not stopping anybody else from doing the same or similar things, just doing it because I want to have a local archive anyway.

If anybody is interested, I could make a ZIP file of those pages afterwards and host it. *shrugs*
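
A rough sketch of the batch approach described here, assuming the print_topic URL pattern used elsewhere in this thread and six-digit topic numbers. BATCH_SIZE, PAUSE, the function names and the output file naming are all made up for illustration; this is not the script that was actually used.

```shell
#!/bin/sh
# Fetch print-view topics in batches, pausing after each batch so the
# server only sees short bursts of traffic.
BASE='http://nice.purrsia.com/cgi-bin/ultimatebb.cgi?ubb=print_topic;f=79;t='
BATCH_SIZE=50   # assumed: pages per batch
PAUSE=60        # assumed: seconds to wait between batches

# Print the six-digit topic IDs from $1 to $2, one per line.
topic_ids() {
    n=$1
    while [ "$n" -le "$2" ]; do
        printf '%06d\n' "$n"
        n=$((n + 1))
    done
}

# Download topics $1..$2 into directory $3, sleeping after each full batch.
fetch_batches() {
    mkdir -p "$3" || return 1
    count=0
    for id in $(topic_ids "$1" "$2"); do
        # -f: treat HTTP errors as failures instead of saving the error page
        curl -f -s -o "$3/topic_$id.html" "$BASE$id" || echo "skipped $id" >&2
        count=$((count + 1))
        if [ $((count % BATCH_SIZE)) -eq 0 ]; then
            sleep "$PAUSE"
        fi
    done
}

# Example (hits the network; the range and directory are placeholders):
#   fetch_batches 1 100 ./nice-print
```

Topics that no longer exist are reported on stderr and skipped, so gaps in the numbering do not stop the run.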
:boogie

llearch n'n'daCorna

There's an idea.

*codecode* ok, it's whipping away. I'll have to manually get the css and stuff, but that's not too hard... heck, I could wget a single page and get all that.

I'll let you all know where it is when I'm done.

Edit:
Oh, and just in case anyone is interested:

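# Save the print view of every topic in forum 79 under ./cgi-bin/
pushd /home/llearch/nice
# seq -w zero-pads to four digits (0044 ... 3660); the 00 prefix below makes six
for i in $(seq -w 44 3660); do
    # -f: treat HTTP errors as failures; --create-dirs: make cgi-bin/ if missing
    curl --create-dirs -f -# -o "cgi-bin/ultimatebb.cgi?ubb=print_topic;f=79;t=00$i.html" --url "http://nice.purrsia.com/cgi-bin/ultimatebb.cgi?ubb=print_topic;f=79;t=00$i"
    sleep 5  # pause between requests to go easy on the server
done
popd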

skwerly

Welp, I managed to whip Mr Aridas's Teleport Pro into shape, and got all 3655 printer-friendly topics archived.  They weigh in at about 120MB, with another 4000 images (200MB) to go with 'em.

I also grabbed the 102-page topic list, and thought it'd be neat to combine 'em all into one page.  Uh, that didn't work out so well.  I ended up with a 5MB takes-forever-to-load html file.  Looks purdy though. :)

Teleport chokes on the regular version of the forum (waaay too many url references), so I can't help there.  I'm sure Mr llearch or someone has a way to pull it off.  If not, well, at least we'll have something..

Say, while I'm here, does anyone have the images from this topic, where Ms Amber was showing us the process of drawing Mischa?  It's pre-Haxored, so all the links are broken now.  That's the trouble with webcrawlers these days.  They just don't want to archive stuff that's not there anymore. *sigh*

Damaris

The Professor painstakingly recreated that thread in the Art Forum, I do believe.  It has the pictures as well.

You're used to flame wars with flames... this is more like EZ-Bake Oven wars.   ~Amber
If you want me to play favorites, keep wanking. I'll choose which hand to favour when I pimpslap you down.   ~Amber

Sid

I also got the 120MB package and a small index (no fancy stuff, just a LONG list with the thread names and links - ordered by creation date and with "200x" headings whenever a new year came by). Image URLs are non-local (so it tries to get the images from the remote locations). The folder compresses neatly into 25MB (even though that's just using my default archiver's default options - I might be able to squeeze it more with other means, who knows).

Oh, and the tutorial is at http://clockworkmansion.com/forum/index.php?topic=719.0

Amber Williams

Since you guys seem to have a good handle on things, I'm going to announce that when I return from the States, I'll be making the formal request for the Nice DMFA forum to be removed. (which might take a few weeks to get done for all I know)

I'll be leaving next Mon-Tues and expect to return a week from then if not sooner.

skwerly

I'm making one last attempt at grabbing the regular version with Teleport, using hideous restrictions so it'll stop wandering into too many URLs I don't want and crashing.

Why am I doing this?  Because the printer-friendly version doesn't have avatars or signatures.  And really, what's the point of a forum without avatars and signatures?  No one believes it when you say "I only visit them for the articles!" anyway.

Meanwhile, the Purrsia admins are screaming "STOP WASTING MY BANDWIDTH ALREADY!!"

*cringe*

Damaris, Sid, and Prof: Thank you! *happysnoopydance*

Ms Amber: Enjoy your trip, good luck n all that.

Tapewolf

Sid, I'd be interested in having a copy of your archive if that's not a problem.

**EDIT**
INTERESTED, not INTERESTING dammit!  Don't post in a hurry...

Kasarn

Quote from: skwerly on September 21, 2006, 11:57:53 PM
Meanwhile, the Purrsia admins are screaming "STOP WASTING MY BANDWIDTH ALREADY!!"

What happened when somebody ganked Monoceros Media...
http://tugrik.livejournal.com/496847.html

Of course, there was also the time Tug was slashdotted... that caused The Nice to die for a day...

Sid

Quote from: Tapewolf on September 22, 2006, 06:17:49 AM
Sid, I'd be interesting in having a copy of your archive if that's not a problem.

Going to leave the house in a few (seminar... so full of hate...), but I'll upload it once I return (early evening, I guess). :)

Edit:
http://www.youkai.de/DMFA_Forum_(Sep_17).zip
- Roughly 25 MB (uncompressed size: roughly 120 MB)
- Contains (as far as I was able to verify without going through all threads by hand) the entire Forum in PrintView Mode
- File contains no images, all image URLs point to the original URLs (Advantage: You can download only this file without getting tons of broken images. Disadvantage: Net connection needed for full experience, threads like PAY need LONG to load due to suddenly having to load dozens of images.)
- Contains "index.htm" that shows all thread titles (sorted by thread creation date), links, and when a new year started
- Not sure if this extracts to a new folder automatically. Better be on the safe side and put the ZIP in an empty folder before doing careless "Extract all here" actions. :P

skwerly

Kudos, Sid!

Okay, while Sid here's been distributing a printer-friendly archive, I've been trying to beat Teleport Pro into submission over the "normal" version.  What do we call that, anyway? Non-printer-friendly? Graphic-intensive? Whatever.  Anyway, I think I finally got it.

Problem is, TP simply can't handle all the URLs from such a large project, and even trying to break it into chunks has it choking on the multi-page threads. (APF, anyone?)  Sooo, the only thing I could come up with is to break it into the smallest chunks possible: one project per page.  It's almost like telling my browser to Save As.. Complete Web Page, but more automated and much less boring.

Anyway, the batches (8000 of 'em!) are running now, one at a time, and should be ready by.. um.. Tuesday.  I'm doing only one thread with a delay between files so the Purrsia admins don't scream so loud.  Then I just mash it into a "flat" archive and delete the billions of duplicate images.  I better free up some disk space. :)

I suppose I should ask if there's anyone (besides me) who'd be interested in the full graphic version once it's done.
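
The duplicate-image cleanup mentioned above could be sketched roughly like this. It assumes GNU sha256sum is available and that file names contain no newlines or backslashes; list_duplicates is a made-up helper name, and the directory in the usage comment is a placeholder. It only lists the duplicates, so the list can be checked before anything is deleted.

```shell
#!/bin/sh
# list_duplicates DIR: print every file under DIR whose content is identical
# to a file that sorts earlier (by checksum, then by path). The first file of
# each identical group is treated as the copy to keep and is not printed.
list_duplicates() {
    find "$1" -type f -exec sha256sum {} + | sort | awk '
        { hash = $1; path = substr($0, 67) }  # 64 hex digits + 2 separator chars
        hash == prev { print path }
        { prev = hash }'
}

# Example (deletes files; review the list first):
#   list_duplicates ./flat-archive > dupes.txt
#   while IFS= read -r f; do rm -- "$f"; done < dupes.txt
```

Any page that pointed at a removed copy would still need its image link rewritten to the copy that was kept.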

Aridas

It would be meeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee yes indeed

bill

Are those extraneous "e"s necessary?

Aridas