discussion list corpus project

Postings | Mary Lynn Hughes | May 12th, 2002

Dear All,

In response to James saying how useful he’d found previous postings re his GE questions, I thought I’d pass on the following, in case anyone’s interested.

I’ve now finished compiling a corpus of postings from the MSc list (c. 225,000 words), going back to late 1998 (with gaps here and there), and while I was doing it, I kept thinking ‘What a wonderful list this is!’ (Those of you who’ve been around long enough will remember how slow it was getting off the ground.) There are so many interesting discussions, ideas for research and assignments, references, teaching ideas and occasional theoretical disputes, as well as the personal experiences, humour, study tips, anxiety, commiserations, etc. that make it so human and alive.

I only wish I had it all catalogued by topic, so I could quickly find things when I want them. (Originally, I tried to do that, but too many messages covered more than one subject, and separating them out was too difficult and time-consuming, so I gave it up and lumped them all together.) If anyone had the energy to do that and it could be collected in one place (a website?), it would be a great resource for people coming onto the course (along with assignments?).

Anyway, if anyone wants a copy, I’d be happy to send it to you. The corpus files (.txt format) amount to about 1.5 MB, so it could be sent via email in a couple of tranches. As well as using it for study reference, there are loads of possibilities for analysing the corpus, e.g. for TDA, LEX, IIC, CL or other modules.

Hope you’re all making progress under less stressful circumstances.
Cheers,
Mary Lynn

 
