


Free Online Scientific Repository Hits Milestone
ocean_soul writes "Last week the free and open access repository for scientific (mainly physics but also math, computer sciences...) papers arXiv got past 500,000 different papers, not counting older versions of the same article. Especially for physicists, it is the number-one resource for the latest scientific results. Most researchers publish their papers on arXiv before they are published in a 'normal' journal. A famous example is Grisha Perelman, who published his award-winning paper exclusively on arXiv."
I Am Forever in Debt to Arxiv (Score:5, Interesting)
I owe a lot of my knowledge to that site. Here's to another 50,000 papers, Arxiv. And another and another and another
Also, the Arxiv Physics blog [arxivblog.com] is a regular favorite in my Liferea news feed account.
Comment removed (Score:4, Interesting)
Re: (Score:1, Funny)
Re: (Score:3)
Re: (Score:2)
That's true for biology and is undoubtedly true for physics as well. The bar is high for papers, a lot of your results by themselves prove nothing but strongly indicate things, confirming or denying your ideas. You can "know" something years before you can say it's true, and although that can be misleading, a lot of times your colleagues will be very interested in it.
Still, an online repository of papers is good for somewhat current stuff, full details, and getting information faster in many cases.
Re: (Score:3, Funny)
Ah, so you're working in the oral tradition, then?
Re: (Score:2)
perhaps that's simply an issue of convention. i don't see why linguistic papers couldn't be written and published in a similar fashion. is there no way to distill the private e-mail lists and informal conference discussions (transcripts) in a formal academic paper? seems like an open scientific/academic repository would benefit all disciplines, just as mailing lists and conference discussions do.
i'm really happy to hear of Arxiv's success (i only heard about the site a month or two ago). this type of open e
Re: (Score:1)
Re:I Am Forever in Debt to Arxiv (Score:4, Insightful)
Well, this is the problem of perception a lot of people have - that scientists are the anti-social ones. Scientists cannot work in a vacuum - we need communications with one another, interactions and a knowledge of other work to get on with our own work. You build off other people's work, use the things recently discovered to move your own work forward, so you need to have constant fast communications of the latest discoveries. Good physicists are always talking to one another, asking about work done, clarifying points and collaborating - just check out how many of those papers have multiple authors, often at separate institutions.
Compare this to a social science/humanity subject where sitting in your ivory tower is basically encouraged, with publications of great single-authored treatises seemingly the only output. They don't need to talk to one another and many are outright hostile to any discussion of their work.
Disclosure: I'm a physicist with an SO in the humanities. The differences in our experiences are incredible - people in my department like each other and work together.
Re:I Am Forever in Debt to Arxiv (Score:4, Interesting)
My room was littered with papers printed off to read on the bus or at work.
A good reason to buy an Amazon Kindle/Apple iPhone/Sony Reader.
Re: (Score:2)
Re: (Score:2)
Most Physics/CS papers use a standard two column format; you could pinch and zoom in to a single column filling the screen width-wise on the iphone; it would probably be decent.
Re: (Score:2)
No.
I thought so too, so I bought the eReader from Sony. I deal with scientific papers a lot, printing them and usually never reading them -- a pile here, a pile there.
The Sony product just doesn't cut it. Here's an unordered list of why:
Re: (Score:2)
An e-reader will be worthwhile when it costs $100 or less and is the size of a magazine.
Re: (Score:1)
i'm the first to comment (Score:3, Insightful)
i'll beat all the cynical punch savvy posters to the punch!
that comma is in the wrong place, i see 50,0000. I guess they need another article on properly writing numbers.
Re:i'm the first to comment (Score:5, Informative)
>that comma is in the wrong place
Right. The correct number is 500,000 (not "50,0000").
(arxiv.org [arxiv.org] actually says 497,649 as of a moment ago.)
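For anyone wondering how "50,0000" and "500,000" relate: it is the same integer, just grouped in fours instead of threes. A minimal sketch, assuming a hypothetical helper named group_digits (not from the article or from arXiv) that groups digits from the right:

```python
def group_digits(n: int, size: int = 3, sep: str = ",") -> str:
    """Group the decimal digits of a non-negative integer from the right.

    Hypothetical helper for illustration only.
    """
    digits = str(n)
    groups = []
    while digits:
        groups.append(digits[-size:])   # take the last `size` digits
        digits = digits[:-size]         # drop them and continue leftwards
    return sep.join(reversed(groups))


print(group_digits(500_000))           # groups of three: 500,000
print(group_digits(500_000, size=4))   # groups of four:  50,0000
print(f"{500_000:,}")                  # Python's built-in thousands separator
```

So the headline's digits were fine; only the group size was off.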
Comma is wrong, 0 right [Re:i'm the first to comme (Score:2)
Re: (Score:1)
I thought it was the third zero that was wrong.
Re: (Score:2)
I thought it was the third zero that was wrong.
Looks like a lot of people thought the same thing.
In fact, though, it was the comma that was wrong, the zeros that were right.
Re: (Score:2)
Not insightful, but informative.
He's not showing any insight here. Instead, he's presenting information.
You suck. Now THAT's insight.
Re: (Score:1)
geez, even a fourth grader knows that!
Re: (Score:2)
Re:Also #1 for mathematicians! (Score:5, Insightful)
Re: (Score:2, Informative)
one thing to bear in mind is that it is not peer reviewed, *anybody* can stick *anything* there.
This is true. However, they do have a group of moderators which recategorizes what they think are "merely mediocre, speculative, or erroneous articles". See http://front.math.ucdavis.edu/ifaq#nonsense
Of course, this is not the same as peer-review, but at least it's something.
Re: (Score:2)
Re: (Score:1, Insightful)
Re: (Score:1)
8bit theatre, D&D [youtube.com]
Re: (Score:2)
Re: (Score:2, Informative)
In it he writes:
As an experiment, Greg Kuperberg looked at the publication status of the first 100 papers in theoretical high energy physics posted to the arXiv in December 1998. He found that 81 had appeared in journals, 11 were conference proceedings or invited lectures, and 2 were Ph.D. theses. "Thus at least 94 of the 100 have been blessed by some form of peer review," he concludes.
Re:Also #1 for mathematicians! (Score:4, Interesting)
Note that I still think it's very valuable to have a place where non-peer-reviewed material can be uploaded alongside peer-reviewed work, but if it's not peer-reviewed it's a lot more likely to be incorrect somehow, and the reader needs to be aware of that.
Re: (Score:1, Interesting)
But that was 1998 where a) the general population was just getting online and b) pretty much only scientists knew about arXiv.
These are valid objections, I agree.
Looking at Oct 2007 for hep-th, and assuming that the summary would mention it if a paper is published or going to be published (and trust me, people mention this...), out of the first 25, 12 are published in a journal and/or conference proceedings. So less than 50% were blessed by some form of peer review.
As a comparison, I did the same thing in my own field of mathematics. I looked at the first 25 articles uploaded to arXiv in October '07. As in your field, only 12 were either published or were PhD theses.
But, FWIW, from my quick look at them, there were no obvious nonsense articles.
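To put the samples quoted in this subthread side by side, here is a rough sketch of the arithmetic. The counts are the ones given in the comments above; the variable and function names are made up for illustration, and samples of 25 or 100 papers are far too small to say anything firm.

```python
def reviewed_share(reviewed: int, total: int) -> float:
    """Percentage of sampled papers with some form of peer review."""
    return 100.0 * reviewed / total


# Kuperberg's December 1998 hep-th sample: journals + proceedings + theses.
kuperberg_reviewed = 81 + 11 + 2

samples = {
    "hep-th, Dec 1998 (first 100)": (kuperberg_reviewed, 100),
    "hep-th, Oct 2007 (first 25)": (12, 25),
    "math, Oct 2007 (first 25)": (12, 25),
}

for label, (reviewed, total) in samples.items():
    print(f"{label}: {reviewed}/{total} = {reviewed_share(reviewed, total):.0f}%")
```

Keep in mind that papers posted in October 2007 may simply not have been published yet when these counts were made, so the 2007 figures are a lower bound at best.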
Re: (Score:2)
The best nonsense articles are those that require more than a quick look to determine they are nonsense. In any joke, the punchline has to come at the end, not the beginning.
The problem is when someone who is not an expert in the field comes to a site with unreviewed articles. He can't determine in a quick look what is bogus and what isn't. If you are trying to learn about something new, unreviewed papers are a crapshoot.
Re: (Score:1)
This isn't a bug, it's a feature.
If people are looking for quality-filtered articles, they should restrict their search to something other than just "everything in arXiv". If they don't, and take everything in there as gospel, then they're fools and deserve what they get.
ArXiv doesn't put itself out there as a peer-reviewed source; it's pretty up-front about not doing that, in fact. There's a place for peer-reviewed, high-quality sources, but there's also a demand for something else: access to information
Paper must die (Score:1)
But note that nothing prevents publishing online-only peer-reviewed journals... maybe that's the future of arXiv. Paper must die, it just creates silly troubles... we end up needing, for example, sites like JSTOR [jstor.org] in order to access out-of-print issues or foreign titles that were never imported.
Re:Paper must die (Score:4, Insightful)
I am strongly against journal sub fees, as I believe that the knowledge contained in scientific papers (doubly so for publicly funded ones) should be available to all and not only accessible to people willing to pay the high cost of a journal subscription fee. CERN is pushing open journals for that very reason, and that may evolve into a respected online peer-reviewed journal which will complement arXiv nicely.
Re: (Score:2, Informative)
it is not peer reviewed, *anybody* can stick *anything* there.
I think they've changed things a little bit over time. It does seem like anyone is able to register an account, which would allow them to start submitting papers. But looking at the help pages, I see this on an endorsement system: "Effective January 17, 2004, arXiv.org began requiring some users to be endorsed by another user before submitting their first paper to a category or subject class." They note that this isn't peer review, but it "will verify that arXiv contributors belong [to] the scientific c
Re: (Score:2)
Re: (Score:2)
"This is the major reason why we still unfortunately need paper journals. We need somebody to read it and say yes this follows basic scientific procedures and to the best of his/her knowledge there are no mistakes."
Darwin did not do any of this with On the Origin of Species, and many scientific ideas from the past came out in lay/not overseen books for the reader. The fact that ideas are peer reviewed or not is quite irrelevant to their truth. In fact peer review is flawed now knowing what we know about hu
Re: (Score:2)
Re: (Score:2)
Re: (Score:1)
50,0000? (Score:2)
Re:50,0000? (Score:5, Funny)
It's half-a-million. CmdrTaco doesn't deal with such large numbers very often.
Re: (Score:1)
Re: (Score:1)
Re: (Score:2)
There are other User IDs besides yours, you know.
Re: (Score:1)
Re: (Score:2)
Presumably 50,000 math papers. The remainder are a large but poorly numerically defined set of "new math" papers.
Re: (Score:3, Informative)
Re: (Score:1)
Re: (Score:2)
Vegeta! (Score:1)
Vegeta! What does the arXiv say about their number of articles?
It's fifty ten THOUSAAAAAND!!!
Re: (Score:2)
But the Western style of breaking into three digit groups is more common these days.
According to Wikipedia [wikipedia.org] this is also seen in Japan, and India has a rather eccentric
There are interesting differences (Score:5, Interesting)
Here are some in fields I follow :
In astrophysics, almost all new papers appear first in Arxiv.
In planetary physics, some but by no means all papers appear in Arxiv.
In geophysics, basically no papers appear in Arxiv.
I don't know why there are these differences, but there it is.
Re:There are interesting differences (Score:5, Informative)
Re: (Score:2)
Re: (Score:2)
http://en.wikipedia.org/wiki/Imaginationland [wikipedia.org]
Re: (Score:1, Informative)
Answers in Genesis has a creation 'science' journal here [answersingenesis.org].
Unix comes with all of them (Score:1)
Re: (Score:2)
It was written: cat
Re: (Score:1)
Re: (Score:1)
>In geophysics, basically no papers appear in Arxiv.
That's because geophysicists mostly don't grok TeX. Don't know why. Maybe because they start off as geologists?
It's science (Score:5, Funny)
If it's a science publication, should it have hit a kilometer-stone instead of a milestone?
Re: (Score:1)
Both units should be metric. I propose the kilometer-kilogram--which is about .1 milestones.
Re: (Score:1, Redundant)
Re: (Score:2)
Yes, but that's precisely three kiloFarnsworthies less funny.
500,000+ articles (Score:5, Funny)
But the question we are all asking ourselves is
Who got the first post?
The answer is Exact Black String Solutions in Three Dimensions [arxiv.org] by James H. Horne and Gary T. Horowitz
Slightly better than the "Fkrst Pist" attempts on Slashdot!
How significant (Score:3, Insightful)
Re: (Score:2)
Considering they were started in 1991 and have now only gotten to 500,000, this is significant.
Re: (Score:2)
int quantity = 0;
if (quality == quantity)
{cout "Yes it does.
Like anything else: quantity and ease of access (Score:5, Insightful)
I realize that you were being snarky, but you accidentally hit on a corner of the truth. The real value of the ArXiV is indeed its quantity of results, mixed with the ease of access. The traditional journals typically restrict access to their output -- unless you are at a subscribing institution, it costs $15-$50 to access a single article from a single traditional scientific journal (depending on publisher). At professional institutes and universities, which typically have online subscriptions to journals, it is possible to surf through the Literature (depending on field, back about 10-15 years) and find recent relevant knowledge extremely quickly. If you aren't at an institution that subscribes, you're SOL. ArXiV fixes that - if you publish your article both in a journal and in the ArXiV, most indexing services will notice that it is the same, and suddenly everyone on the planet has unrestricted access. That's a no-brainer for an author.
The way that professional scientists (like me -- I am a solar astrophysicist) access the Literature has changed drastically in the last ten years. My office has about 12 linear feet of Xeroxed journal articles in three-ring binders, but I practically never refer to them. It's far faster and more convenient to access (say) the entire archives of Astrophysical Journal online than to go "grep dead trees" at the library. Citation indices such as ADS (Google for adsabs) hyperlink both references and citations, so that I can search through 50 articles relevant to a topic in less time than it used to take to look up one article and Xerox it for reading outside the library.
Old-style pay-to-read journals get in the way of that rapid access - for example, I have rarely cited articles in Astronomy and Astrophysics, because it's a pain in my ass to download them. Until recently, my institute didn't subscribe, so I had to either pay on a per-article basis (which adds up if you are skimming for the one relevant article in a dozen possibilities), or travel to the local university to get the paper I wanted. This is a very common problem: even large universities generally don't subscribe to all the relevant journals in a given field, because web subscriptions cost thousands to tens of thousands of dollars per year per journal!
For everyone not fortunate enough to have a computer account at a large institute that can actually afford to subscribe to dozens of journals, ArXiV is the best way to access a large volume of the literature. Hence, articles posted to the ArXiV get cited more. That makes authors want to post to the ArXiV as a matter of course. It's a virtuous circle.
So, er, yes, quantity is quality in this case -- ArXiV was canny and/or lucky enough to get a critical mass of good work, and the quantity is the driving force that keeps the whole thing going.
Re: (Score:2)
A solution (Score:1)
Hopefully this helps... (Score:4, Insightful)
I'm not going to pretend 50,000 is a lot, but the fact it's 50,000 and growing should make them worry. I hope the celebration of this milestone will help accelerate its growth so we see 100,000 sooner rather than later. The quicker pay-for-access science disappears, the better for all of us.
Re: (Score:3, Informative)
Re: (Score:1, Redundant)
500,000
Re: (Score:2)
Many journals do let the authors publish elsewhere, as a matter of course. (Astrophysical Journal is one.) Others can be strong-armed. The copyright agreement they send is not just a formality, it is the actual terms under which the authors license the work to the journal. I routinely write in that I retain a non-exclusive right to re-publish. Haven't had problems with that yet.
What about peer-review? (Score:1, Interesting)
I've seen that they've started a system where you need an endorsement from another arXiv author to post a pre-print, but is an endorsement enough, considering the likely fact that endorsers don't really check the paper properly?
Re: (Score:2)
"It seems that a lot of people follow their field by reading pre-prints posted to arXiv. Isn't this kind of dangerous, considering the lack of peer-review?"
Peer review is great for some things, but just ask Galileo how 'peer review' worked for him. 7 years in a prison as a part of the inquisition. I do realize, that today scientific breakthroughs are treated a little differently, unless you're talking about Genetic Engineering, which has its own set of inquisition style prohibitions.
but yeah even oth
Re: (Score:3, Informative)
Peer review is great for some things, but just ask Galileo how 'peer review' worked for him. 7 years in a prison as a part of the inquisition. I do realize, that today scientific breakthroughs are treate
Just a note, Galileo's trial by the inquisition was not a problem of peer reviewing: it wasn't that he couldn't get his work published; it was what happened after it was published.
Well done! (Score:1)
Congralculations on that SCIgen benchmark!
Fifty ten-thousand? (Score:3, Funny)
Re: (Score:1, Interesting)
You jest, but... (Score:1)
You jest, but if what very, very little I understand of Japanese is in order... Well, maybe our great Taco has merely been watching too much anime.
In Japanese, ten-thousand is "man" (pronounced with an "a" somewhat like the "a" in "father": "mahn"). What we would call "five hundred thousand" would instead be called "go-jyu man" ("go" = "five", "jyu" = "tens", and "man" = "ten-thousands": "five tens, ten-thousands"). So, basically, "fifty ten-thousands" would be a fairly accurate English representation of
In other news... (Score:3, Interesting)
Archive this (Score:1)
The arXiv is great, but..... (Score:3, Interesting)
We really need to begin compiling our scientific knowledge into a hyperlinked wiki/database of sorts.
Wikipedia's great for basic stuff, though there's still gobs of information (much of which is in the public domain) that's inexplicably confined to books and journals.
Hyperlinks (and extended data sets) should be *standard* for all journal articles these days, given that we have the technology to do so. There's no reason that the arXiv needs to remain as a repository for dead-tree PDFs.
Re: (Score:3, Informative)
At some level, hyperlinks (at least) are standard. They're called "references" and were the closest thing to a hyperlink before the intertubes were invented. Several free services (ADS is one: http://adsabs.harvard.edu/ [harvard.edu]) have spiders that walk the literature and create genuine URL-style links between articles. ArXiV is advancing custom along that path, by making many journal articles available for linking to anyone free of charge.
Extended data sets are coming. Astrophysical Journal allows online publicat
Not a search engine. (Score:2, Informative)
To clarify, arxiv is a document repository (you submit your papers there). If you want a scientific papers search engine, use citeseer [psu.edu].
Note that citeseer also indexes arxiv documents :)
Is there peer review? (Score:2)
Re: (Score:2, Informative)
XXX.LANL.GOV (Score:3, Interesting)
was the original .. with the skull/crossbones icon. Now it's all too easy and happy looking.
Forbidden knowledge (Score:2)
You know only terrorists need scientific information.
Wiki-like research (Score:2)
Re: (Score:3, Informative)
Here's how it works (for me at least):
First you write a paper - this is the hard part. Then you can submit it to Arxiv - usually done at the same time as submission to a journal, though some choose to wait for any initial backlash/corrections before doing this. Arxiv normally publishes it the next working day with no peer review (8pm EST the night before) for all to see online. Meanwhile your journal is still looking for peer reviewers. No journal in physics can now ask to be the sole source for any article