Friday, 8 October 2010

Surfing History

I have been wondering how easy or difficult it would be to examine one's browsing history: say, the number of times you've visited one kind of site versus another (social media vs. email vs. blogs, etc.).

It wasn't as straightforward as I had hoped. I fired up Visual Studio Express after years of Excel documents and mind-numbing PowerPoints.

The first thing was to find out which .NET namespaces deal with browsing history. And the first thing was a failure, as I couldn't find ANY.

After much tedious googling, I found a sample at iok.net. It laid out everything quite clearly.

The main item here is the INTERNET_CACHE_ENTRY_INFOW structure.

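    using System;
    using System.Runtime.InteropServices;

    // Managed mirror of the native INTERNET_CACHE_ENTRY_INFOW structure.
    // A class defaults to automatic layout, so sequential layout is requested
    // explicitly; CharSet.Unicode matches the wide-character (W) variant.
    [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
    public class INTERNET_CACHE_ENTRY_INFOW
    {
        public uint dwStructSize;
        public string lpszSourceUrlName;
        public string lpszLocalFileName;
        public uint CacheEntryType;
        public uint dwUseCount;
        public uint dwHitRate;
        public uint dwSizeLow;
        public uint dwSizeHigh;
        public System.Runtime.InteropServices.ComTypes.FILETIME LastModifiedTime;
        public System.Runtime.InteropServices.ComTypes.FILETIME ExpireTime;
        public System.Runtime.InteropServices.ComTypes.FILETIME LastAccessTime;
        public System.Runtime.InteropServices.ComTypes.FILETIME LastSyncTime;
        public IntPtr lpHeaderInfo;
        public uint dwHeaderInfoSize;
        public string lpszFileExtension;
        public uint dwReserved;
    }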


In addition, the STATURL structure gives you information about the site itself:



    // Describes one history entry: URL, page title and visit timestamps.
    struct STATURL
    {
        // Unmanaged size of the structure, computed once.
        public static uint SIZEOF_STATURL =
            (uint)Marshal.SizeOf(typeof(STATURL));

        public uint cbSize;
        [MarshalAs(UnmanagedType.LPWStr)]
        public string pwcsUrl;
        [MarshalAs(UnmanagedType.LPWStr)]
        public string pwcsTitle;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftLastVisited,
            ftLastUpdated,
            ftExpires;
        public uint dwFlags;
    }
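Both structures store their timestamps as FILETIME values, which are split into two 32-bit halves and are not very readable as they stand. Here is a minimal sketch of turning one into a DateTime. The helper name ToDateTimeUtc is my own and is not part of the sample or of any library.

```csharp
using System;
using FILETIME = System.Runtime.InteropServices.ComTypes.FILETIME;

static class FileTimeHelper
{
    // Hypothetical helper: joins the two 32-bit halves of a FILETIME into
    // one 64-bit value and converts that value to a UTC DateTime.
    public static DateTime ToDateTimeUtc(FILETIME ft)
    {
        // The cast to uint stops the low half from being sign-extended.
        long fileTime = ((long)ft.dwHighDateTime << 32) | (uint)ft.dwLowDateTime;
        return DateTime.FromFileTimeUtc(fileTime);
    }
}
```

With this in place, something like FileTimeHelper.ToDateTimeUtc(vSTATURL.ftLastVisited) should give the last visit time. An all-zero FILETIME maps to 1 January 1601, so a date like that probably just means the field was never filled in.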


Once you have these, it's just a question of looping through the History folder, like so:



             
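    // IUrlHistoryStg2, IEnumSTATURL and UrlHistory are COM interop
    // declarations that are not shown in this post.
    IUrlHistoryStg2 theHistory = (IUrlHistoryStg2)new UrlHistory();
    IEnumSTATURL vEnumSTATURL = theHistory.EnumUrls();
    STATURL vSTATURL;
    uint isFetched;
    // Next returns 0 (S_OK) for as long as it hands back an entry.
    while (vEnumSTATURL.Next(1, out vSTATURL, out isFetched) == 0)
    {
        Console.WriteLine("{0}:{1}\r\n", vSTATURL.pwcsTitle, vSTATURL.pwcsUrl);
    }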

The next step would be to see how this can be worked into an Excel template that charts the number of pages you visit on each site.
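As a first step towards that chart, here is a rough sketch of how the URLs from the loop could be tallied per site and turned into CSV lines that Excel can open. The names HistoryStats, CountByHost and ToCsvLines are mine, and I am assuming that the host part of the URL is a good enough stand-in for a "site".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class HistoryStats
{
    // Groups URLs by host name and counts how many fall under each host.
    // Anything that is not an absolute URL, or has no host, is skipped.
    public static Dictionary<string, int> CountByHost(IEnumerable<string> urls)
    {
        var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        foreach (string url in urls)
        {
            Uri uri;
            if (!Uri.TryCreate(url, UriKind.Absolute, out uri) || string.IsNullOrEmpty(uri.Host))
                continue;

            int current;
            counts.TryGetValue(uri.Host, out current); // current stays 0 for a new host
            counts[uri.Host] = current + 1;
        }
        return counts;
    }

    // Renders the counts as "host,count" lines, largest count first.
    public static IEnumerable<string> ToCsvLines(Dictionary<string, int> counts)
    {
        return counts.OrderByDescending(pair => pair.Value)
                     .Select(pair => pair.Key + "," + pair.Value);
    }
}
```

To use it, collect each vSTATURL.pwcsUrl into a List<string> inside the loop, then pass that list to CountByHost and write the result of ToCsvLines to a file with System.IO.File.WriteAllLines. Note that this counts history entries (pages) per host, not individual visits.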
