Software Sorts Electronic Evidence
securitas writes: "The New York Times has a very interesting article about the legal industry using new search software to sort through electronic evidence such as e-mail, documents and recovered files, and the process that they go through to make the evidence usable. It has spawned an industry."
grep (Score:5, Funny)
Re:grep (Score:1)
%
Damn.
Re:grep (Score:1)
Re:grep (Score:1)
patent #afdvg5q9j4qt
Use of existing programming techniques to produce filtered text in a way anyone else could have thought up, only I patented it first.
Re:grep (Score:2)
Information Overload (Score:1)
Software that sorts intelligently has been around for quite some time; Google's "I'm Feeling Lucky" link is probably a great example of this. The number of times I've hit what I'm looking for with a few judicious '+'s using Google is unbelievable.
Sounds almost like they are reinventing the wheel...
policy to the rescue (Score:2, Interesting)
Yes, but what kind of electronic format? (Score:2, Interesting)
Re:Yes, but what kind of electronic format? (Score:1)
Progress! (Score:1)
*golf clap*
-- Brett
Useful against spam? (Score:3, Insightful)
Re:Useful against spam? (Score:1, Funny)
Re:Useful against spam? (Score:3, Informative)
Re:Useful against spam? (Score:2, Interesting)
alternate link (Score:1)
;)
-- Brett
Re:alternate link (Score:1)
The amount of evidence is a problem (Score:4, Interesting)
Re:The amount of evidence is a problem (Score:2)
Biased Searches? (Score:3, Insightful)
Re:Biased Searches? (Score:1)
Waiting for my 20 seconds to expire....
Re:Biased Searches? (Score:2)
Link (Score:3, Funny)
In the future (Score:2, Funny)
Lawyer Eliza: (Walks up to the witness) So how are you today?
Witness: I am fine thank you.
L Eliza: How long have you been fine thank i?
Witness: I don't understand the question...
L Eliza: Don't you really understand the question?
Witness: That's right.
L Eliza: Is it really that right?
Prosecutor Eliza: Objection Your Honor! She is harassing the witness!
Judge Eliza: Why are you concerned about my honour she is harassing the witness?
and so on...
Name of the software? (Score:2, Informative)
The only company I see is OnTrack Data International. Anyone?
I'm interested here cuz I do tech/litigation support in a law firm.
The article does miss one important little detail. The first level of sorting is done by clerks or paralegals. Associates do the law-related grunt work, but that's AFTER someone making $10-$15/hour has gone through and sorted out the pr0n (trust me, lawyers get A LOT!) and other pointless crap.
The pointy end of the search problem. (Score:5, Interesting)
1) we had a case of an outgoing elected official low level formatting their HDD on the way out the door. Had to be sent out to a special data recovery lab. (they can do some amazing things with scanning electron microscopes on half tracks and such)
2) there are stacks and stacks of 8" floppy disks, in formats like IBM DisplayWriter, and other chunks of physical hardware that haven't been seen by mortal man in 20 yrs.
Finding a chunk of info is damn tricky, but after you find it, you have to find something that can read the punchcard/papertape/magtape/floppydisk/harddisk in question. And due to a quirk in how the original act was written (keeping in mind that these things were written back when data was carved on rock slates and format wasn't a big consideration) we were required to keep it in its original form.
I feel for someone with my job in 50 yrs. I ran away from govt work after that. It was scary!
One plus side. EMP has a hard time taking out papertape!
Re:The pointy end of the search problem. (Score:1)
there are stacks and stacks of 8" floppy disks, in formats like IBM DisplayWriter, and other chunks of physical hardware that haven't been seen by mortal man in 20 yrs.
Hell, just try to find a 5 1/4" drive these days.
Re:The pointy end of the search problem. (Score:1)
Those were the days.
This isn't new (Score:2)
The leader in the industry is a Company called Megaputer [megaputer.com], and their clients included the US government, Boeing, the CDC, and many large companies.
It's not merely grep you halfwits (Score:2, Funny)
To work efficiently with several years' worth of email, more advanced techniques are required. Specifically, you need a text indexing program tied to a relational database. While this doesn't give you any more power than recursive searches using the grep and find combo, it's much, much faster, as your keyword and message attributes can use B-tree index lookups and a cost-based optimizer to reduce disk reads.
That being said, it's still not that impressive of software. I'm certain that I could build the search component in a couple of weeks using Microsoft SQL Server (with the neat full text indexing feature) and a moderately adept GUI developer could hammer out a decent interface in the same amount of time.
Still, there's a difference between "trivial to implement with any decent RDBMS" and "I can do it with a 2 line bash script". You would do well to remember it.
Your friend,
--Shoeboy
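To make the indexing idea in the comment above concrete, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in for the relational database (the comment mentions Microsoft SQL Server; SQLite is swapped in only because it needs no setup). The table names, the function names build_index and search, and the sample addresses are all hypothetical, and the tokenizer is deliberately crude.

```python
import re
import sqlite3


def build_index(messages):
    """Load (sender, body) pairs into an in-memory database plus a keyword table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE message (id INTEGER PRIMARY KEY, sender TEXT, body TEXT)")
    db.execute("CREATE TABLE posting (word TEXT, message_id INTEGER)")
    for sender, body in messages:
        cur = db.execute(
            "INSERT INTO message (sender, body) VALUES (?, ?)", (sender, body)
        )
        # One posting row per distinct lowercase word in the message body.
        words = set(re.findall(r"[a-z0-9]+", body.lower()))
        db.executemany(
            "INSERT INTO posting VALUES (?, ?)",
            [(word, cur.lastrowid) for word in words],
        )
    # SQLite stores this index as a B-tree, so a keyword lookup avoids a full scan.
    db.execute("CREATE INDEX posting_word ON posting (word, message_id)")
    return db


def search(db, word, sender=None):
    """Return (id, sender, body) rows containing word, optionally for one sender."""
    sql = (
        "SELECT m.id, m.sender, m.body FROM posting p "
        "JOIN message m ON m.id = p.message_id WHERE p.word = ?"
    )
    params = [word.lower()]
    if sender is not None:
        sql += " AND m.sender = ?"
        params.append(sender)
    sql += " ORDER BY m.id"
    return db.execute(sql, params).fetchall()


if __name__ == "__main__":
    db = build_index([
        ("alice@example.com", "Please shred the Q3 memo"),
        ("bob@example.com", "Lunch on Friday?"),
        ("alice@example.com", "Memo attached"),
    ])
    for row in search(db, "memo", sender="alice@example.com"):
        print(row)
```

The tokenizing cost is paid once at load time; after that, each query is an index lookup plus a join instead of a pass over every file, which is the speed difference the comment describes. A real system would also need stemming, phrase search, and attachment handling, none of which is attempted here.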
Re:It's not merely grep you halfwits (Score:1)
For interactively searching gigabytes of files, we have:
find . -exec grep "cum-stained blue dress" {} \; -print | more
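For comparison, the brute-force approach of that one-liner can be sketched in Python as well. This is only a rough equivalent, assuming the files are plain text; the function name scan_tree is hypothetical, and unlike grep it does substring matching rather than regular expressions.

```python
import os


def scan_tree(root, phrase):
    """Yield (path, line_number, line) for every line under root containing phrase."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                # errors="replace" keeps the scan going over non-text bytes.
                with open(path, "r", errors="replace") as handle:
                    for number, line in enumerate(handle, start=1):
                        if phrase in line:
                            yield path, number, line.rstrip("\n")
            except OSError:
                # Unreadable file: skip it and continue with the rest of the tree.
                continue


if __name__ == "__main__":
    for path, number, line in scan_tree(".", "memo"):
        print(f"{path}:{number}: {line}")
```

Every query rereads every file, so the cost grows with the size of the archive rather than with the number of hits.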
search software is good (Score:1, Troll)
If we really want the Internet to permeate into our lives, then it should go into our lives as they really are. Perhaps some people will be less wary about leaving evidentiary data lying about on the Net.
When we decry censorware, or searchware or whatever, we are decrying a social use of technology and not the technology itself. Rather than stifling the development of search technologies or other supposedly "authoritarian" tech, we should be adding to the debate about what kind of a society we live in.
I will be writing a variant of this for a controversial website [adequacy.org] soon, in support of rigidly restricted appliance computers and limited-access proprietary content AOL style networks developing alongside the open Internet. In this society we have prisons, in which the prisoners can't use the Internet much because the software and hardware that would allow them to use it within prison rules (reliably, monitored by non-technical prison officials) does not exist.
I would rather the educational and self-betterment resources available on the Net be extended to prisoners with the blessing of prison officials, so prisons which have lost their education budgets can restore these services cheaply.
Digital Evidence Software (Score:5, Interesting)
I am currently the designer and project lead for a cross-platform open source (GPL) digital evidence processing suite. It is intended to bring together the various functionalities required to perform this type of work, and (ideally) operate on whatever platform the investigator desires. Our primary development platform is RedHat 7.1.
There are currently software packages out there that attempt to do this, including EnCase [guidancesoftware.com] and The Forensic Toolkit [accessdata.com] in the commercial arena and The Coroner's Toolkit [porcupine.org] in the open source arena; however, they lack the broad filesystem support and/or true ease of use to make them usable by everyone. The other barrier is price, as EnCase, for example, costs thousands of dollars per copy.
We're well funded, and have already done a significant amount of work. We have some of our core components functional and plan on starting beta testing and releasing our first code drop later this year. If this field interests you and you'd like more information, or you work in the investigative field and have thoughts on what you'd like to see in such a tool, I'd love to hear from you.
Re:Digital Evidence Software (Score:1)
Enjoy.
Spawning another industry (Score:2, Interesting)
I'm not saying it's the right thing to do, or even that it's legal, but since most companies are not even aware of what data they do have backed up, and what is retrievable and what isn't, I could see this happening, if it hasn't already.
Re:Spawning another industry (Score:1)
Obligatory Netscape story [jwz.org].
Obligatory text to avoid the postercomment compression filter. I'm beginning to think that the trolls are right about Taco not being able to code; considering how much ASCII art I've seen in the last couple weeks, it's amazing that my little bit of HTML won't fly...
wow, that's all i can say (Score:1)
Admin - www.newspad.org [newspad.org]
NewsPAD - the daily news source for geeks!
Other application of this tech: viruses! (Score:3, Insightful)
No wonder litigation is so expensive (Score:1)
About a half dozen lawyers spent weeks at a time in the period leading up to the trial in 1998 wading through many thousands of pages of printed electronic documents. "It was a lot of paper," said one former government lawyer who worked on the case.
Which explains a lot about why litigation is so expensive! :) What I find humorous, being in the jury pool for our county's Superior Court, is that we (as a society) can afford to pay a half dozen lawyers to sit around poring over printed e-mails, but can't afford to validate parking for jurors. Assuming you can prove the validity of your evidence, I can see how a method to automate this would be very attractive.
Forensics must be "admissible" to be useful (Score:4, Informative)
As far as user-friendly interfaces for forensic-ware, and other suggestions by comment-posters for improving the technologies, don't forget that in order to be useful to a lawyer, digital forensic evidence must be admissible in a court of law. Nobody is going to settle a lawsuit based on some damning piece of deleted email recovered from their hard drive, unless you convince them that the jury trying their case is going to see a big blow-up poster of all the bad things they said in it.
In order to get that recovered data into evidence (at least in the USA), the lawyer must "lay a foundation" that the evidence has some reliability. An eyewitness to an event, for example, can testify about things she was able to see or hear from her particular location, but her testimony about what might have been happening out of her eye-earshot is not admissible in court. Another way to lay a foundation is through a qualified expert opinion, for example, an accident reconstruction expert who measures the skid marks and applies a scientific method to determine whether the car was speeding before the accident.
The point being, even if I as a lawyer could read up on the relationship between skid marks and vehicle speed, make those measurements on my own, and perform the calculations just as accurately, that would not do me a bit of good. I would still have to go out and retain someone with considerable expertise in such matters in order to get the court to admit the results of the calculations into evidence, or I never get to put them on my blow-up poster for the jury. And this is not just a gimme. Especially in federal court, there are specific criteria for the qualifications that an expert must have, and the demonstrated reliability of the expert's method, before the results can be admitted in court.
So for those of you who are devising tomorrow's user-friendly forensics - a warning. No matter how point-and-clicky you make them, my lawyer colleagues and I will likely never touch them. Even though I am technically literate enough to grep anything you can grep, I'll keep on hiring one of you technical experts when I need some digital forensics done, because I need your experience, credentials and signature to convince a court that the results are reliable and not just wishful computer hokey-pokey by a lawyer who wants her client to win. (Also, lawyers don't testify in their own cases, as a rule, for various reasons.) This is especially true with things that *sound* somewhat unreliable, like recovering from low-level formats and such. The more extrapolation and guesswork is involved in the "recovery," the less likely it is to get into evidence.
And if you're developing a search method, or some other new technique for data recovery, keep in mind that in order to qualify yourself and the technique as proper expert testimony, you're likely going to have to disclose quite a bit about how you did it in order to lay the foundation for admissibility. You can just throw those valuable little trade secrets and patentable methods out the window. That's another reason why legal tech forensic shops tend to rely on things like grep and dd rather than innovating - where's the big payoff? Now if you don't care about admissibility, and are just mining the hard drives of your ex-employees (or ex-spouses, or whatever) for business reasons, maybe that's a different story. But most people don't think they're about to get into a lawsuit until it happens, so I wouldn't be so sure.
hit load of links. (Score:1)
This is where Autonomy started (Score:2)
From that original start, they then (allegedly) gained the interest of the intelligence services, and then the media companies and dot coms, to become the players they are today.
NSA Line Eater (Score:1)
Don't lawyers heed DMCA? (Score:2)
I don't think the DMCA is a good idea. In fact, I think it sucks. The best way to defeat that trash legislation is to hold EVERYONE, especially the legal community, to the letter of the language.