Example

Using the LinqToWeb integration consists of two parts.
  • First, create a new .webml description file using the LinqToWeb item template (Add -> New Item). In it you describe the abstract structure of the information and the algorithm used to obtain the data from the web. The algorithm is a pseudo-description of the process, written in the same way a user browses the Internet. It can also integrate data from several different web sites, exposed under the same abstraction. The description is used to generate a C# implementation that collects particular data lazily, only when the client application needs it.
/* Search.webml */
//
// Class declarations.
//

class Result
{
	string title;
	string url;
}

class QueryState
{
	string info;
}

//
// Helper C# methods.
//

string urlencode(string str) c# @'
	return System.Web.HttpUtility.UrlEncode(str);
'
string htmldecode(string str) c# @'
	return System.Web.HttpUtility.HtmlDecode(str);
'

//
// Extraction methods.
//
// Methods named "main" cannot be called explicitly; they represent the entry points of extraction.
// Arguments of main methods can be either value types or references to user-defined classes and lists.
// - Values of value-type arguments are automatically initialized within the context class constructor.
//   They should be used as read-only variables.
//   Value types: string, int, bool, double, datetime
// - Object and list arguments represent the results of extraction. They are automatically exposed as properties of the context class.
//   Accessing these properties executes the extraction when an unknown value is requested or a list is enumerated.
//   They should be used as write-only variables.
//

main(Result[] BingResults, QueryState state, string searchQuery)
{
	[open("http://www.bing.com/search?q=" + urlencode(searchQuery))]
	{
		bingpage(BingResults);
		searchinfo(state);
	}
}

// extracting from bing results page
bingpage(Result[] items)
{
	foreach(xmlmatch(@'
		<li class="sa_wr">
			<div class="sa_cc">
				<div class="sb_tlst">
					<h3><a href="~@rhref@~">~@rtitle@~</a></h3>
				</div>
			</div>
		</li>
	'))
	{
		items[] = Result(url=htmldecode(rhref),title=rtitle);
	}
	
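	// For each link with class "sb_pagN", open the linked page and extract its results into the same list.
	foreach(xmlmatch(@'<a href="~@rhref@~" class="sb_pagN"></a>'))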
		[open(htmldecode(rhref))]bingpage(items);
}

searchinfo(QueryState state)
{
	foreach(match(@'<span class="sb_count">~@strstat@~</span>'))
		state.info = strstat;
}
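  • Second, the integration installs a code generator that automatically generates a C# proxy class implementing your .webml algorithm. You can then use this class immediately in your C# code. In the example below, the constructor argument supplies the searchQuery value, and the BingResults property exposes the extracted list.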
/* Program.cs */
var context = new Search("phalanger");
foreach (var x in context.BingResults)
    Console.WriteLine("{0}; {1}", x.url, x.title);
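
The lazy behaviour described above can be illustrated with a small self-contained sketch. It is not the code LinqToWeb generates; FakeSearch, Pages, PagesOpened and the example.com URLs are hypothetical names invented for this illustration, and in-memory arrays stand in for real web pages. It only shows the general C# technique (an iterator with yield return) by which a list property can defer work until it is enumerated.

```csharp
/* LazySketch.cs */
using System;
using System.Collections.Generic;
using System.Linq;

class Result
{
    public string title;
    public string url;
}

// Hypothetical stand-in for a generated context class; NOT actual LinqToWeb output.
class FakeSearch
{
    // In-memory "pages" replace real HTTP requests in this sketch.
    static readonly string[][] Pages =
    {
        new[] { "a", "b" },
        new[] { "c", "d" },
        new[] { "e" },
    };

    // Counts how many fake pages have been "opened" so far.
    public int PagesOpened { get; private set; }

    // Iterator block: the body runs only as far as the consumer enumerates.
    public IEnumerable<Result> Results
    {
        get
        {
            foreach (var page in Pages)
            {
                PagesOpened++; // stands in for opening the next results page
                foreach (var name in page)
                    yield return new Result { title = name, url = "http://example.com/" + name };
            }
        }
    }
}

class Program
{
    static void Main()
    {
        var context = new FakeSearch();
        Console.WriteLine(context.PagesOpened); // no page has been opened yet

        // Only the pages needed to produce three results are opened.
        foreach (var x in context.Results.Take(3))
            Console.WriteLine("{0}; {1}", x.url, x.title);

        Console.WriteLine(context.PagesOpened); // the last fake page was never touched
    }
}
```

Taking only the first three results stops the enumeration inside the second fake page, so the third page is never opened. This is the kind of on-demand behaviour the generated context class is described as providing for real web pages.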

Last edited Jan 13, 2012 at 1:52 PM by jakub, version 3
