The Works

Monday, February 20, 2006

Next Gen navigation and wayfinding

Have you ever wished for a guide and assistant that stays with you 24/7, wherever you go, and provides the information you want?
We worked on an RFID application for guiding visually impaired persons inside buildings using active RFID tags, direction sensors and semantic dialog delivered on Wi-Fi enabled portable devices. Using RFID for position sensing has advantages over GPS: it keeps working indoors, it is cost-effective, and it offers much higher resolution. Most sophisticated RFID readers cost around $20 apiece and a tag costs up to 50 cents. Hence our motto: "Go, Tag the world".
Enactment of the proposed project would result in a model and prototype of a hand-held device for visually impaired navigation of unfamiliar environmental spaces through wireless sensing and communication. The deliverables would consist of a handheld device, accessories necessary for sensing and tracking such as the RFID tags, RFID readers, an electronic compass, server software for data communication, text-to-speech software, an earpiece and a wristband. We will also deliver all requisite algorithms for optimum design using the components provided, a model for their implementation, and detailed documentation of all implemented solutions.

The active RFID tags, readers and the electronic compass will be used for locating a user and tracking their progress towards any destination. On the other hand, the handheld PDA, earpiece, and the haptic wristband will be used to guide the user and manage the navigation dialog between the user and the navigation system.
In the retail environment, each item will also have its own passive tag and a bar-code so that an RFID reader attached to the handheld PDA or a barcode reader can be used to retrieve item information such as the price, description and availability upon demand.
The system will be designed to perform the following functions:

Mapping and localization: The items in a building, e.g. a retail space, will be tagged with low-cost RFID tags. A high-power portable reader with a typical read range of 1-3 feet will be used to map all the products in the space. The active tag carried by the user will transmit an ID, which the RFID network connected to a central server will use to localize the user with established techniques such as time difference of arrival (TDOA) or a signal-strength function.

Guidance: Guidance data is delivered to the PDA, where it is constantly updated by the server in communication with the localization module. Guidance commands are relayed to the user as vibrations in the haptic wristband. For example, if the user should move to the right, the right part of the wristband vibrates. The user can also press the wristband's input pressure sensors/switches to explore the surroundings: if she wants to know what is to her left, she presses the left switch on her wristband and the voice interface speaks the exploratory dialog.

Dialog Generation: This subsystem will manage the assistive navigational dialog through a multi-modal interface comprising natural language, acoustic cues, onomatopoeia and vibratory (haptic) cues.
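
To make the localization and guidance functions above more concrete, here is a minimal Python sketch. It is an illustration under assumptions, not the project's implementation: it uses a signal-strength weighted centroid as a simple stand-in for the TDOA or signal-strength techniques mentioned above, and the function names, the reader coordinates and the 20 degree tolerance are all hypothetical.

```python
import math

def rssi_to_weight(rssi_dbm):
    """Convert a signal strength in dBm to a linear weight (stronger -> larger)."""
    return 10 ** (rssi_dbm / 10.0)

def estimate_position(readings):
    """Weighted centroid of reader positions.

    readings: list of ((x, y), rssi_dbm) pairs, one per reader that heard the tag.
    This is an assumed stand-in for the localization method, not the real one.
    """
    total = sum(rssi_to_weight(rssi) for _, rssi in readings)
    x = sum(pos[0] * rssi_to_weight(rssi) for pos, rssi in readings) / total
    y = sum(pos[1] * rssi_to_weight(rssi) for pos, rssi in readings) / total
    return (x, y)

def turn_cue(position, heading_deg, target, tolerance_deg=20.0):
    """Decide which side of the wristband should vibrate.

    heading_deg is the compass heading: 0 = north (+y), increasing clockwise.
    Returns 'left', 'right' or 'straight'.
    """
    dx = target[0] - position[0]
    dy = target[1] - position[1]
    # Compass bearing from the user to the target.
    bearing = math.degrees(math.atan2(dx, dy)) % 360.0
    # Signed difference in the range [-180, 180); positive means clockwise.
    diff = (bearing - heading_deg + 180.0) % 360.0 - 180.0
    if abs(diff) <= tolerance_deg:
        return "straight"
    return "right" if diff > 0 else "left"
```

For example, a user facing north whose target lies due east would get a `right` cue, which the wristband would render as a vibration on its right side.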

Tuesday, November 01, 2005

USuggest

Aren't you tired of seeing ads for retail products all over the web? Tired of sifting through millions of search results when looking for products online? Ever wanted a shopping site driven by its users? A shopping site that contains deals from all over the web, and not just from a handful of vendors like the comparison shopping engines? Ever wanted all the information about a product in one place? Ever wanted to discuss and share user experiences web-wide for a particular line of consumer products?
Now imagine all this and more in one place. This is the driving vision behind USuggest.

Following the 80/20 rule, it has been noted that 80 percent of web users search for 20 percent of searched items. The phrases people type most frequently into Internet search engines are called “hot tags”. For example, in online searches for specific products, hot tags might be: portable humidifiers, red rugs, garden lighting, etc. Each of these phrases is a hot tag.

Tags constitute a dynamic set of markers that depend on a number of factors. When entered into a search engine, tags generate returns based on the algorithms used by the particular search engine. No two search engines will produce the same returns. Worse, there currently is no automated solution to the problem of the relevance of results returned for hot tags.

People searching for items while shopping online fare no better than anybody else when trying to generate relevant results using standard search engines. For example, a person who enters a tag such as “garden lighting” will generate thousands of resulting “hits”, but there is no quick or automatic way to sort the list for personal relevance. The ideal would be to find the entire garden lighting range of products available on the web in one location, and also be able to determine which garden lighting items other people are buying.

Tags act as the pivot points on the web and, based on their relevance to other online searchers, they attract web traffic. The USuggest.com tagging technology is both passive and active. Passively, the system tracks and stores the tags a user enters during a search event to find relevant items. The USuggest technology also uniquely encourages the suggester actively to add additional tags that may help identify the items they are suggesting to others.
As a result of the active tagging option, online sellers (suggesters) get more exposure for the inventory of suggestions they load into USuggest.com. The suggesters who suggest items under a tag are rewarded for each sale of the underlying item.
To make deal suggestion as simple as possible, USuggest has developed a proprietary toolbar that enables a simple process for uploading deals to USuggest.com. When a deal is located on the web, the finder merely pushes a button on their toolbar and the deal is instantly loaded on USuggest. If a person attempts to post a deal that has previously been suggested, they will receive a message informing them that the deal is already posted.

Deal suggesters are automatically prompted to provide other descriptors (tags) that help to identify the underlying product or service. Thus, shoppers at the USuggest.com site can locate their desired items using a wide range of search terms. For example, individuals seeking running shoes can be more specific and search for “blue running shoes” or “blue women’s running shoes”. These more specific search tags are added by deal suggesters to enhance their opportunities for earning commissions. Significantly, the deal suggester is acting on a personal incentive (the commission-based cash reward) to suggest deals that are designed to appeal to shoppers. Once posted to the USuggest.com website, a deal will be ranked according to the number of people who click through and actually purchase the underlying product or service. Thus, deals will be ranked according to market behavior. Deals listed on USuggest will give shoppers the confidence of knowing that other like-minded shoppers have selected the same product. Marketers have long recognized that word of mouth is a powerful influence on the buying decision. USuggest should therefore help minimize the time a person needs to spend shopping for desired products and services.
USuggest.com’s dual incentive system, search optimization for shoppers and cash rewards for suggesters, is unprecedented.
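
The posting and ranking rules described above (reject a deal that was already suggested, let suggesters attach tags, rank by actual purchases) can be sketched in a few lines of Python. This is only an illustrative model under assumptions: the `Deal` and `DealBoard` names, the URL-based duplicate check and the in-memory storage are hypothetical and do not describe the real USuggest.com backend.

```python
from dataclasses import dataclass, field

@dataclass
class Deal:
    url: str
    tags: set = field(default_factory=set)
    purchases: int = 0  # number of click-throughs that ended in a purchase

class DealBoard:
    """Toy in-memory model of deal suggestion, tagging and ranking."""

    def __init__(self):
        self._deals = {}  # normalized URL -> Deal

    @staticmethod
    def _key(url):
        # Assumption: two suggestions are duplicates when their URLs match.
        return url.strip().lower()

    def suggest(self, url, tags):
        """Post a deal; return False if it has already been suggested."""
        key = self._key(url)
        if key in self._deals:
            return False
        self._deals[key] = Deal(url=url, tags={t.lower() for t in tags})
        return True

    def record_purchase(self, url):
        """Count one completed purchase for an existing deal."""
        self._deals[self._key(url)].purchases += 1

    def search(self, tag):
        """Return deals carrying the tag, most purchased first."""
        hits = [d for d in self._deals.values() if tag.lower() in d.tags]
        return sorted(hits, key=lambda d: d.purchases, reverse=True)
```

A real system would also need to attribute each purchase to the suggester so the commission can be paid; that bookkeeping is left out here.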
USuggest is currently at an angel funding candidate stage, but we have backup plans in place to take it to the next level slowly and steadily.

Here is a sample page in USuggest.

Forex Advisor

Forex Advisor is the company that came out of my efforts in the summer of 2005. I was researching different investment options after being spammed to death by get-rich-quick emails. Painfully, there was no such shortcut, but the search introduced me to varied areas like real-estate investment, stocks, futures, options and, of course, foreign exchange. Stocks were the natural advice of veteran investors, and that is where I started. But I quickly realized that it wasn't possible to earn a steady income from fluctuations in stocks unless I had a triple digit account balance to play with. Forex was a boon to a person like me: unprecedented leverage, a 24-hour market, high liquidity and dependence on macro-level fundamentals were some of the factors that got me started. The technical analysis I had studied for stocks came in handy, and it turned out to be even more applicable in forex. I day traded and blew up my account before realizing what veteran traders meant by "lizard brain" and "emotional trading". But I learnt quite a lot in the process, devised some strategies and then sought to automate them. I was in talks with TradeStation™ when MetaTrader came out with its trading software for forex, and I learnt MQL4 to implement my mechanical strategies programmatically. The experiment was successful and I started trading with real money. I started sending my signals to a closed group of friends, and once they were convinced, they came forward and put their own money into play too. Seeing the profits, this closed group kept insisting that I take in more money, and that is when 4xadvisor was formally born. The company runs like a limited subscription fund and trades for its clients in exchange for a percentage of the gains as its commission. There is an array of technical indicators that we play with.
The technical analysis strategies are thoroughly backtested before being deployed on a demo account and, after consistent performance there, are moved onto the real accounts. It's an exciting mix of computer science and financial analysis, and it has been performing amazingly well.

Cure-Osity

Collaborative Curation

This system was developed for Dr Chitta Baral at Arizona State University and is nicknamed CBIOC.
It is an inexpensive and scalable approach for curation that takes advantage of automatic information extraction methods as a starting point, and is based on the premise that if there are a lot of articles, then there must be a lot of readers and authors of these articles. It provides a mechanism by which the readers of the articles can participate and collaborate in the curation of information.
Besides the data that exists in various public and private databases, there is a much larger and ever increasing amount of information buried in existing biomedical articles. It is beyond human ability to read the various relevant articles and recall relevant findings of these articles for further research. Therefore, it becomes clear that the findings in these articles have to be culled and stored in a database such that the data can be integrated with other existing databases.
The system is designed as a web browser sidebar assistant for biotechnologists. It mines rules and facts from text data and takes user input for curation and validation. It rests on the premise that the collective judgement of people is better than any purely algorithmic approach to knowledge collection and representation. When biologists are conducting searches or browsing biological databases like PubMed, they can keep the CBIOC sidebar open, which constantly displays the facts contained in the current abstract. These facts have either been automatically extracted and validated or have been added by other users. The curated and summarized data is available as a worldwide knowledge base (www.Cbioc.org).

E-Commerce venture

After hardcore Windows system development I decided to try something different and got into e-commerce systems. I helped my former advisor, IV Ramakrishnan, set up the systems and processes at his start-up company for building a large-scale retail products portal publishing system, which earned revenue through affiliate marketing. The lessons learnt here were helpful in USuggest.
There I also developed a focused crawler for collecting product data from online vendors. The system obtains and maintains the metadata through controlled asynchronous web service requests. This web service data is bootstrapped with an RE extractor and crawler for finer attribute extraction. We crawl Amazon extensively with this system to get fresh data and attributes not provided by the web service.

BioLog

This work was done as part of a consulting contract with the Arizona-based genomics research company TGen (www.tgen.org).
Communicating with colleagues who study similar topics often helps to identify information relevant to our own area of study that might otherwise not have been found. Hence, there have been many organizational efforts, and a variety of tools, to support sharing of knowledge as well as data within communities with shared research areas. The collective knowledge of a set of experts differs from the massive, general text archives we typically rely on, since it is limited to a particular realm of findings. It differs further in that it reflects the experts' current models of the field, and it is dynamic, constantly changing as a result of the researchers' search activity. While data sharing among experts is improving constantly, model extraction and sharing has not improved. BioLog aims to accelerate the acquisition of collective knowledge in well-defined areas by identifying specific spheres of inquiry and the corresponding groups of people. It also provides a systematic way to gain knowledge from their online search activity, and enables them to organize and share their findings for further analysis.
BioLog implemented biological content capture, taxonomy building, dynamic querying, query by example, query archival, taxonomy/query sharing, indexing and keyword-based searching.

Monday, October 31, 2005

AppliedE

· After my MS from Stony Brook the start-up bug bit me and I joined a start-up called AppliedE as its Chief Technology Officer. I developed an enterprise tracking system that mined client activity logs for billing, reporting and auditing purposes. The software was targeted at the legal space, where lawyers have to maintain time sheets for the research work they do for their clients. It consisted of application plug-ins and a .NET GUI with a SQL Server backend that automatically tracked the websites visited and the time spent on each, to generate a comprehensive report for the clients. It also tracked application windows (such as Word and Excel).

· Architected and developed a patent-pending research collaboration tool, Partner Online™ v1.1, with an organizational search engine, annotation sharing, automatic alert generation and a content relevance ranking mechanism. This was a natural evolution of the fine-grained tracking capability that we had developed. I realized that all the tracked data could be used to enable collaboration in an enterprise.

· Designed a desktop search and archival system providing multi-parameter querying and novel search visualization techniques for web data recollection. The system was similar to the desktop search systems that came a year later, but I still feel it had some unique capabilities. Firstly, we cached and archived pages that change. For example, the CNN home page is never the same if you access it a day later, so if you remember seeing something on the front page, it might no longer be there unless you have an archived copy. To present the different archived copies we had a visual interface that showed thumbnails of the search results. Apart from that, we provided the whole logical click-stream, not just the search result, so the user could see how he ended up on a particular page and what he saw along the way. The user could search by dates, page titles, keywords, etc.

· Designed a personal data management system for user customization of web content through real-time annotation (highlights, notes, categorization, etc.) of web pages via a toolbar in MS IE. My love for Internet Explorer plug-ins went a step further when we discovered how we could change the rendered page in the browser and put annotations on the page itself. The annotations were stored on a document server with the summary and any user notes, and all this information was searchable throughout the enterprise.

· I spearheaded product development for AppliedE, hired 5 developers and completed the beta development within 6 months. This was when I cultivated my love for entrepreneurship while pitching ideas to potential investors. The hectic days entailed train rides in NYC, presenting to investors on Wall Street during the day and development during the night. We attracted angel investment to the tune of $700K, and most of the technologies I invented found their way into patent filings. PartnerOnline bagged the LISA (Long Island Software Awards), competing against companies like CA, Northrop Grumman and Infosys.

Used VC++, WIN32, BHO, Named Pipes, ATL, COM, ODBC, MySQL, Crystal Reports, C# and XML.
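
As a rough illustration of the "archive pages that change" idea from the desktop search system above, the Python sketch below stores a new copy of a page only when its content differs from the last archived copy, and can return the copy that was current on a given date. It is a simplified sketch under assumptions, not the AppliedE implementation (which was written in VC++): the `PageArchive` class, its method names, the SHA-256 content hash and the in-memory storage are all hypothetical.

```python
import hashlib

class PageArchive:
    """Keep one archived copy per distinct version of each page."""

    def __init__(self):
        # url -> list of (timestamp, digest, content), in visit order
        self._versions = {}

    def visit(self, url, content, timestamp):
        """Record a visit; return True if a new version was archived.

        Assumes visits to a URL are recorded in increasing timestamp order.
        """
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        versions = self._versions.setdefault(url, [])
        if versions and versions[-1][1] == digest:
            return False  # page unchanged since the last archived copy
        versions.append((timestamp, digest, content))
        return True

    def as_of(self, url, timestamp):
        """Return the latest archived copy at or before timestamp, or None."""
        best = None
        for ts, _, content in self._versions.get(url, []):
            if ts <= timestamp:
                best = content
        return best

    def version_count(self, url):
        return len(self._versions.get(url, []))
```

Timestamps here can be any comparable values (for example, `datetime` objects or integers); a search-by-date feature like the one described above would call something like `as_of` for each matching page.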

Stony Brook University and WinAgent

· WinAgent is a Browser based (IE Toolbar), Intelligent, Reusable and User configurable Web Mining / Web Agent Builder tool, using Machine learning based Wrapper Generation principles. The tool implements learning by example, DOM extraction and correlation and Agent Logic generation.
Have you ever wanted to just crawl or grab the information on a web page and store it as a list or other structured information? That is exactly what WinAgent does.
It is an intuitive system in which you have a toolbar on your web browser. On a web page from which you want to extract information (let's say a page with a list of MP3 players), you highlight a single product and push the record button; the system pops up a dialog and shows you exactly what it can extract. You press "OK" and now you have an agent specification (in the form of an XML file) which can be used on that site to extract the information from its pages. We have successfully crawled products from Amazon, buy.com and Zappos to demonstrate the effectiveness of the system. The system is based on a wrapper learning technique which stores partial XPaths as the agent specification. The algorithm works on the XML page generated by converting the HTML using the MSXML parser. When the extractor is run, these partial XPaths are fed to an extraction algorithm to correctly identify the attributes of items in a list or group. Its advantage over Georgia Tech's XWrap system is that it is easier to use. The system has been highlighted at top-notch conferences like IUI and ISMB. It was programmed entirely in VC++/COM and is therefore a very fast system that works in real time as the user interacts with his/her browser.

· Designed the wrapper / agent logic execution interpreter using an automatic form filling and navigation plug-in for IE. The agent specification generated by the WinAgent system needs to be executed before it can crawl and extract information from web pages. The interpreter takes in the XML file and works directly on the DOM grabbed from the web browser. It automatically traverses links and fills in any forms as necessary to reach the target page, and then fires the information extractor to get the records from the web page. This tool was built as a VC++ based GUI system with a web browser COM component embedded in it.

· Designed a multithreaded, XMLizing web crawler for the agent tool. The agent execution system described above was bottlenecked by having to load each web page in the embedded browser before the information extractor could run. So I designed a crawler which used WINHTTP to make multithreaded calls for crawling the pages and rapidly converted them into XML using MSXML. This data is then passed to the extraction algorithm to get the records from the page.

· Pioneered Windows development in the Logic lab, resulting in the WinAgent system, which has been used by major organizations like NASA, BNL and TGen. I introduced VC++/COM programming in the LMC lab at Stony Brook University for high-performance data extraction in real time through intuitive user interactions. Led a team of 12 research assistants on the WinAgent project. The WinAgent system has been extensively used to mine web sites for collecting archival data for libraries, building toxin and biological agent databases from online sources of information, extracting information from online catalogs, etc.
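
The learn-by-example idea behind WinAgent (highlight one item, store a partial XPath, then reuse it to extract every similar item) can be illustrated with a small Python sketch. This is not the WinAgent algorithm, which ran in VC++ on MSXML output; it is a simplified analogue using only Python's standard `xml.etree.ElementTree`, and the function names and the rule "drop positional indices from the example's path" are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

def learn_path(root, example_text):
    """Learn a partial path from one highlighted example.

    Finds the first element whose text equals example_text and returns the
    chain of tag names from the root down to it, without positional indices,
    so the same path also matches the example's siblings in a list.
    """
    parents = {child: parent for parent in root.iter() for child in parent}
    for node in root.iter():
        if (node.text or "").strip() == example_text:
            tags = []
            while node is not root:
                tags.append(node.tag)
                node = parents[node]
            return "./" + "/".join(reversed(tags)) if tags else "."
    return None

def extract(root, path):
    """Apply a learned path and return the text of every matching element."""
    return [(node.text or "").strip() for node in root.findall(path)]
```

A possible usage, with a well-formed page standing in for the XML produced from HTML:

```python
page = ("<html><body><ul>"
        "<li><b>Player A</b><i>$10</i></li>"
        "<li><b>Player B</b><i>$20</i></li>"
        "</ul></body></html>")
root = ET.fromstring(page)
path = learn_path(root, "Player A")   # learned from a single example
names = extract(root, path)           # matches both products in the list
```

The learned path plays the role of the agent specification: it can be saved and reapplied to other pages of the same site that share the layout.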

We used VC++, WIN32, MSXML, MSHTML, XPath, WINHTTP, COM, ATL, DOM and JavaScript for developing WinAgent.
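
The multithreaded crawler described above avoided the cost of loading each page in an embedded browser by fetching pages in parallel. A minimal Python analogue of that design is sketched below; it is an assumption-laden illustration rather than the original WINHTTP/MSXML code. The `fetch` and `crawl` names, the default of 8 workers and the pluggable `fetcher` parameter are hypothetical, and the conversion of each page to XML is left out.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

def fetch(url, timeout=10):
    """Download one page and return its text."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")

def crawl(urls, fetcher=fetch, workers=8):
    """Fetch many pages concurrently.

    Returns a dict mapping each URL to its page text, or to None if the
    fetch failed, so one bad page does not stop the whole crawl.
    """
    def safe_fetch(url):
        try:
            return fetcher(url)
        except Exception:
            return None

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with urls.
        return dict(zip(urls, pool.map(safe_fetch, urls)))
```

Passing a custom `fetcher` makes it easy to test the crawler offline or to plug in a step that converts each page to XML before extraction.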

Sunday, October 30, 2005

Quark Work

· I joined Quark after my bachelor's degree and designed a Flash filter for Quark documents, which was demonstrated by the Quark CEO at Seybold 2000. I designed a module for QuarkXPress 5 which handled the display of Flash in QXDs (QuarkXPress documents). Think of it as similar to adding Flash support to Microsoft Word. The good part of this project was reverse engineering the Flash format (in a hex editor, of course) and figuring out how to find basic Flash parameters, such as the dimensions of the Flash movie to be displayed.

· Designed a multithreaded crash diagnosis mailing module for QuarkXPress.
This was one of my first forays into the world of commercial software development. Imagine a graduate fresh out of college being handed 3000 lines of WIN32 code and asked to figure out why the module was not able to work in parallel with the other tasks of the application. I looked into various approaches, and multithreading was the way to go. So I implemented a solution in which an independent thread would take the diagnostic information (the "DUMP") and mail it to the company without affecting the main software's behaviour.

· Resolved the highest number of bugs in the QuarkXPress 5 beta.
This was one of the fun times at Quark. There was a team of about 25 people working tirelessly to release QuarkXPress 5 on time, and some of the best industry practices were in play. We used the SilkRadar bug tracking system. The bugs discovered by QA were automatically put in the queue after being assigned priorities. The developers fixing the bugs had exotic colored meters on their desktops which showed who the current high scorers were. Of course there were penalties for breaking the daily build, jokingly called the donut offence (the culprit had to treat the whole team to donuts). I remember being at the top of the colored meter for 2 weeks before finally leaving Quark.

· Reorganized and maintained the most downloaded Quark module, Jabberwocky 1.0.
Quark Jabberwocky was a funny little widget, albeit a very useful one for typesetters. At the push of a button it would fill the selected space with text stories that made a bit of sense. It worked from a dictionary and a set of grammar rules, generating random sentences according to the grammar rules using words from the dictionary. So if the dictionary contained my name and the word "poor", the software would declare me a poor person in some sentence of the story.

· Obtained the highest appraisal ranking in the Research and Development team. It was a fun time when we were rushing to get QuarkXPress 5 out. The bug hunts were tracked using the SilkRadar system, and I remember everyone dying to have the longest bar next to their name for the number of bugs solved. After the six-month review process I had a good ranking and was nominated for advanced training in Denver, which I politely declined because by that time I had been admitted to grad school with full financial aid.

· Used VC++, WIN32, ActiveX and MAPI.
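
The way Jabberwocky generated filler text, as described above (random sentences built from grammar rules and a dictionary), can be sketched in Python as follows. This only illustrates the general technique; the grammar, the word lists and the function names are made up for the example and are not taken from the Quark module.

```python
import random

# Hypothetical grammar: each symbol expands to one of several sequences.
GRAMMAR = {
    "SENTENCE": [["NOUN_PHRASE", "VERB", "NOUN_PHRASE"]],
    "NOUN_PHRASE": [["the", "ADJ", "NOUN"], ["NAME"]],
}

# Hypothetical dictionary: word classes and the words that can fill them.
DICTIONARY = {
    "NOUN": ["typesetter", "page", "widget"],
    "ADJ": ["poor", "funny", "useful"],
    "VERB": ["fills", "admires", "prints"],
    "NAME": ["Alice"],
}

def expand(symbol, rng):
    """Recursively expand a grammar symbol into a list of words."""
    if symbol in GRAMMAR:
        rule = rng.choice(GRAMMAR[symbol])
        return [word for part in rule for word in expand(part, rng)]
    if symbol in DICTIONARY:
        return [rng.choice(DICTIONARY[symbol])]
    return [symbol]  # a literal word such as "the"

def filler_text(sentences, seed=None):
    """Generate the requested number of random but grammatical sentences."""
    rng = random.Random(seed)
    out = []
    for _ in range(sentences):
        text = " ".join(expand("SENTENCE", rng))
        out.append(text[0].upper() + text[1:] + ".")
    return " ".join(out)
```

Calling `filler_text(3)` returns three short sentences; passing a `seed` makes the output repeatable. Adding a name and an adjective to the dictionary is enough for them to start turning up together in generated sentences, which is the effect described above.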

Saturday, October 29, 2005

Papers & Publications

· Chitta Baral, Prabhdeep Singh, Hasan Davulcu et al “BioLog: A Browser Based Collaboration and Resource Navigation Assistant for Biomedical Researchers” DILS 2005, San Diego.

· Chitta Baral, Hasan Davulcu, Prabhdeep Singh et al “Collaborative Curation of Data from Bio-medical Texts and Abstracts and its Integration” DILS 2005, San Diego.

· Mike Berens et al “WinAgent: Creating and Managing Personal Information Assistants for Biological Data Collection” Software Demos, ISMB 2005, Michigan.

· Prabhdeep Singh, IV Ramakrishnan, Hasan Davulcu et al “Creating and Managing Personal Information Assistants via a Web Browser : The WinAgent Experience” Workshop on Information Integration on the Web (IIWEB-04) held in conjunction with VLDB 2004, Toronto.

· Prabhdeep Singh, IV Ramakrishnan, Hasan Davulcu et al “Creating Personal Information Assistants for Targeted Navigation and Extraction via a Web Browser” International Conference on Intelligent User Interfaces 2004 (IUI 04), Portugal.

· Prabhdeep Singh, IV Ramakrishnan, Hasan Davulcu et al “WinAgent: A System for Annotating Web Documents by Targeted Navigation and Extraction”, International Semantic Web Conference 2003 (ISWC ‘03), Florida.

· Provisional Patent on “Research Collaboration System - knowledge archival and recollection system, for tracking user activity in a client server environment, archiving, retrieving and managing user knowledge, and extracting meaningful insights from the user activity data and archived knowledge”.

· Patent Disclosure on “A privacy-preserving method for co-browsing and community formation based on co-location on the Web”. Authors: Hasan Davulcu, Partha Dasgupta and Prabhdeep Singh.

· Patent Disclosure on “A Suggestion-Portal(tm) Creation and Management Method for Online Products and Services”. Authors: Hasan Davulcu, Prabhdeep Singh, Thomas Duening.

· Prabhdeep Singh, “Crawling using WINHTTP 5”, “Web Data Extraction by crawling through WINHTTP and DOM instantiation”, www.codeguru.com, www.codeproject.com