I have become a big soup freak. It's been one of the fads that's infected me for the last year.
This is a really easy recipe for all you folks who aren't fussed about sitting by the stove all day, and it comes out pretty darn nice.
Take a couple of potatoes, cube 'em
Zucchini *2, remove skin and cut coarsely
Onions *1, slice
Garlic *2, slice
Procedure
1) Roast the onions and garlic
2) Put the zucchini, onions, potatoes and garlic in some vegetable stock
3) Cook till tender
4) Then out comes the trusty hand blender
5) Blend away, add some pepper and salt
6) Bring to the boil
7) Fry some sage with butter and add on top
Serve with crusty bread
This is my space on the world wide web where I share my ideas, thoughts and other nonsensical things that go through my mind...
Friday, March 25, 2011
How to detect language of a document
A paper by William B. Cavnar and John M. Trenkle from Ann Arbor, Michigan.
Definitely worth a read...
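Firstly, what is an n-gram?
An n-gram is an n-character slice of a string (from the paper verbatim).
So for APPLE you will have the n-grams _, A, P, P, L, E, then _A, AP, PP, PL, LE, E_, then _AP, APP, PPL, PLE, etc. (the underscore stands for the blank padding around the word).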
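Here is a rough Python sketch of that slicing. The function name `char_ngrams` is my own, and the padding rule (one blank in front, n-1 blanks behind, written as underscores) is how I read the paper's example, so treat it as an assumption rather than the official implementation.

```python
def char_ngrams(word, n):
    """Return the n-character slices of `word`.

    Assumption: the word is padded with one leading '_' and
    n-1 trailing '_' characters standing in for blanks.
    """
    padded = "_" + word + "_" * (n - 1)
    # Slide a window of width n across the padded word.
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


bigrams = char_ngrams("APPLE", 2)   # ['_A', 'AP', 'PP', 'PL', 'LE', 'E_']
trigrams = char_ngrams("APPLE", 3)  # ['_AP', 'APP', 'PPL', 'PLE', 'LE_', 'E__']
```

With this padding a word of length k always gives k+1 n-grams for every n, which keeps the counts easy to reason about.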
The basic algorithm, if you don't have the patience to read the paper, is:
1) Create an n-gram based profile for each language, i.e. find the frequency of occurrence of all the n-grams in a sample document written in that language.
2) Sort this n-gram profile with the highest frequency on top; this tells you the most frequently occurring n-grams.
3) Now, to find the language of an unknown document, build its profile the same way and sort it by highest frequency.
4) Now find the minimum distance between the document's profile and each language profile, i.e. if the document is written in that language the distance should be very small, because the frequency of occurrence of the words/syllables (n-grams) in the document and in the language will be similar.
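The four steps above can be sketched in Python roughly like this. It is a minimal sketch, not the paper's code: the names `build_profile`, `out_of_place` and `detect_language` are mine, the limits `max_n=5` and `top_k=300` are assumed defaults, and the distance compares the rank of each n-gram in the two sorted profiles (the "out-of-place" idea, as I understand it from the paper) instead of comparing raw frequencies.

```python
import re
from collections import Counter


def build_profile(text, max_n=5, top_k=300):
    """Steps 1-3: count n-grams, sort by frequency, keep the top ones."""
    counts = Counter()
    # Assumption: only runs of letters count as words; digits and punctuation are dropped.
    for word in re.findall(r"[^\W\d_]+", text.lower()):
        for n in range(1, max_n + 1):
            padded = "_" + word + "_" * (n - 1)
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Highest frequency first; ties broken alphabetically so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top_k]]


def out_of_place(doc_profile, lang_profile):
    """Step 4: sum how far each n-gram's rank moved between the two profiles."""
    rank = {gram: r for r, gram in enumerate(lang_profile)}
    max_penalty = len(lang_profile)  # charged when an n-gram is missing entirely
    return sum(
        abs(r - rank[gram]) if gram in rank else max_penalty
        for r, gram in enumerate(doc_profile)
    )


def detect_language(text, lang_profiles):
    """Return the language whose profile is closest to the text's profile."""
    doc_profile = build_profile(text)
    return min(lang_profiles, key=lambda lang: out_of_place(doc_profile, lang_profiles[lang]))
```

To use it you would build one profile per language from some sample text, e.g. `profiles = {"english": build_profile(english_sample)}` (where `english_sample` is whatever training text you have), and then call `detect_language(unknown_text, profiles)`. The bigger the samples, the more reliable the guess.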