[mono-vb] WebCrawler in vb.net (mono)
quandary82 at hailmail.net
Thu Feb 18 14:17:48 EST 2010
I've wanted to do that for a long time.
You can take a look at Apache Lucene, a Java search library, which you
could port to .NET.
Perhaps you can find a way to compile the Lucene library from Java
source or bytecode directly to .NET.
Another way is to extend this CodeProject project:
Then you need a ranking algorithm, such as Google PageRank or, perhaps
better, something like Yahoo TrustRank, plus a parallel computation
library and cluster software for computing the eigenvectors of the
Markov chains (indexing).
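To make the eigenvector part less abstract, here is a minimal sketch of
PageRank by power iteration. It is written in Python only for brevity
and should port to vb.net without much trouble. The function name
pagerank, the damping value 0.85, and the convention of spreading the
rank of pages without outgoing links evenly over all pages are my own
assumptions for illustration, not anything taken from a particular
library or from Google's actual implementation.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Approximate PageRank by power iteration.

    links: dict mapping a page to the list of pages it links to
           (hypothetical input format, e.g. produced by a crawler).
    Returns a dict mapping each page to its rank; ranks sum to 1.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(max_iter):
        # Assumption: rank held by pages with no outgoing links is
        # redistributed evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}

        # Each page passes an equal share of its rank to every link target.
        for p in pages:
            targets = links.get(p, [])
            for t in targets:
                new[t] += damping * rank[p] / len(targets)

        # Stop once the ranks no longer change noticeably.
        delta = sum(abs(new[p] - rank[p]) for p in pages)
        rank = new
        if delta < tol:
            break
    return rank


# Tiny made-up link graph, just to show the call.
example = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(example)
```

The resulting vector is (approximately) the stationary distribution of
the Markov chain, i.e. the dominant eigenvector of the damped link
matrix. For a real index the inner loop is the part you would hand to
the parallel library or the cluster.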
I found this site about PageRank to be particularly useful because of
its incredible simplicity:
On 02/17/2010 03:21 PM, Mauro Risonho de Paula Assumpção wrote:
> I am developing open source software that needs a web crawler, and I
> would like help from the list. The idea is to scan the structure of
> the site (HTTP and HTTPS) and display it in a treeview in vb.net
> with GTK (Mono). Does anyone have any ideas?