[mono-vb] WebCrawler in vb.net (mono)

quandary quandary82 at hailmail.net
Thu Feb 18 14:17:48 EST 2010

I've wanted to do that for a long time.

You can take a look at Apache Lucene, a Java search library, which you
could port to .NET.
Perhaps you can find a way to compile the Lucene library from Java
source/bytecode directly to .NET.

Another way is to extend this CodeProject project:

Then you need a ranking algorithm, such as Google's PageRank or, perhaps
better, something like Yahoo's TrustRank, plus a parallel computation
library and cluster software for computing the eigenvectors of the
Markov chains (indexing).
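
To make the eigenvector part concrete: the PageRank vector is the
stationary distribution of a Markov chain over the pages, and for a
small link graph you can approximate it by plain power iteration
without any cluster. Below is a minimal sketch in Python (chosen only
for brevity; it should translate directly to VB.NET). The function
name `pagerank`, the `links` dictionary layout, and the damping factor
of 0.85 are my own assumptions for illustration, not taken from any
particular library.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Approximate PageRank by power iteration.

    links maps each page to the list of pages it links to
    (hypothetical input format chosen for this sketch).
    """
    # Collect every page that appears as a source or as a target.
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(max_iter):
        # Rank held by pages without outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}

        # Each page passes its rank on in equal shares to its targets.
        for p in pages:
            targets = links.get(p, [])
            for t in targets:
                new[t] += damping * rank[p] / len(targets)

        # Stop once the vector no longer changes noticeably.
        delta = sum(abs(new[p] - rank[p]) for p in pages)
        rank = new
        if delta < tol:
            break
    return rank


if __name__ == "__main__":
    example = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    for page, score in sorted(pagerank(example).items()):
        print(page, round(score, 4))
```

For a real crawl the graph will not fit this naive dictionary loop,
which is where the parallel library and the cluster come in.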

I found this site about PageRank to be particularly useful because of
its incredible simplicity:

On 02/17/2010 03:21 PM, Mauro Risonho de Paula Assumpção wrote:
> I am developing open source software that needs a web crawler. I
> would like help from the list. The idea is to scan the structure of
> the site (HTTP and HTTPS), building it up in a treeview in vb.net
> with GTK (Mono). Does anyone have any ideas?
> Thanks
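
For the scanning part itself, here is a rough sketch of how the site
structure could be collected as a tree, again in Python only for
brevity. Everything here is an assumption for illustration: the names
`LinkParser` and `crawl` are made up, and `fetch` is a function you
supply that returns the HTML of a URL as a string (so the HTTP/HTTPS
download code is left out). Each page is attached to the first page
that links to it, which gives a parent/children mapping you could load
into a treeview.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects the href values of all <a> tags in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl of one host; returns {url: [child urls]}."""
    host = urlparse(start_url).netloc
    tree = {}
    seen = {start_url}
    queue = deque([start_url])

    while queue and len(tree) < max_pages:
        url = queue.popleft()
        parser = LinkParser()
        parser.feed(fetch(url))

        children = []
        for href in parser.links:
            # Resolve relative links and drop the #fragment part.
            link = urldefrag(urljoin(url, href)).url
            parts = urlparse(link)
            same_site = parts.scheme in ("http", "https") and parts.netloc == host
            if same_site and link not in seen:
                seen.add(link)
                children.append(link)
                queue.append(link)
        tree[url] = children
    return tree
```

A real crawler would also need error handling, robots.txt support and
some politeness delay between requests, none of which is shown here.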
