Posted: November 13th, 2009 | Author: Alex | Filed under: Algorithms, Data Mining, python
In Finding the Frequent Items in Streams of Data [PDF], Graham Cormode and Marios Hadjieleftheriou discuss the frequent items problem and some of the algorithms that are used to solve it:
The frequent items problem is to process a stream of items and find all those which occur more than a given fraction of the time. It is one of the most heavily studied problems in mining data streams, dating back to the 1980s. Many other applications rely directly or indirectly on finding the frequent items, and implementations are in use in large scale industrial systems. In this paper, we describe the most important algorithms for this problem in a common framework. We place the different solutions in their historical context, and describe the connections between them, with the aim of clarifying some of the confusion that has surrounded their properties.
What makes the problem interesting is scale: a data stream can easily contain millions (or billions) of items, and the algorithm typically gets only one look at each item as it arrives.
Space-Saving
In this post I focus on the Space-Saving algorithm and provide an implementation in Python.
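To give a flavor of the idea, here is a minimal sketch of Space-Saving as I understand it from the paper, not the full implementation from the post. It keeps at most `k` counters; when a new item arrives and all counters are taken, the item with the smallest count is evicted and the newcomer inherits that count plus one. The names `space_saving`, `counts`, and `errors` are my own, and the linear scan for the minimum is a simplification (the original algorithm is usually described with a structure that finds the minimum in constant time).

```python
def space_saving(stream, k):
    """Approximate item counts over a stream using at most k counters."""
    counts = {}  # item -> estimated count (never below the true count)
    errors = {}  # item -> maximum possible overestimate for that item
    for item in stream:
        if item in counts:
            # Already monitored: just bump its counter.
            counts[item] += 1
        elif len(counts) < k:
            # Free counter available: start monitoring the item.
            counts[item] = 1
            errors[item] = 0
        else:
            # All counters in use: evict the item with the smallest count.
            # Simplification: O(k) scan instead of a constant-time structure.
            victim = min(counts, key=counts.get)
            floor = counts.pop(victim)
            errors.pop(victim)
            # The newcomer inherits the evicted count, which bounds its error.
            counts[item] = floor + 1
            errors[item] = floor
    return counts, errors


if __name__ == "__main__":
    counts, errors = space_saving(["a", "b", "a", "c", "a", "b", "a", "d", "a"], k=2)
    print(counts, errors)
```

For any monitored item, `counts[item] - errors[item]` is a lower bound on its true count, so items for which that difference exceeds the chosen threshold are guaranteed to be frequent. The counts always sum to the number of items seen so far, which is a handy sanity check.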
Posted: November 11th, 2009 | Author: Alex | Filed under: Artificial Intelligence
There is already lots to look forward to in terms of next year’s research at the intersection of Artificial Intelligence and social media.
ICWSM 2010 – the 4th International AAAI Conference on Weblogs and Social Media will be at George Washington University, Washington, DC from May 23-26. The full proceedings for the previous two conferences are still available online. So are video recordings for both 2008 and 2009.
AAAI-10 will be in Atlanta from July 11-15, 2010. It will feature the AI and the Web Special Track. Likewise, the AAAI Spring Symposium, at Stanford from March 22-24, will have a Linked Data Meets Artificial Intelligence track.
IEEE Intelligent Systems has two special issues on social media topics on next year’s calendar:
- Social Learning (July/August 2010):
This special issue will accept papers related to all aspects of learning and knowledge discovery based on the social Web. On one hand, many existing intelligent systems such as natural language processing, information retrieval and multi-agent systems can benefit from utilizing the social Web as an additional knowledge source. On the other hand, the social Web is also an emerging domain for new techniques and applications of intelligence systems. We solicit high quality research papers demonstrating challenging research issues, presenting state-of-the-art theories, techniques and showcasing successfully deployed applications.
- Social Media Analytics and Intelligence (November/December 2010). Paper submissions are still accepted until next May:
This special issue seeks innovative contributions to SM [social media] analytics and intelligence research. Contributions must show relevance (from an either methodological or domain perspective) to at least one AI subfield; we strongly encourage multidisciplinary research with substantive findings in real-world, context-rich settings. The issue will provide an integrated, synthesized view of the current state of the art, identify challenges and opportunities for future work, and promote cross-cutting community-building.
I am sure I am missing lots of others; I will probably post about those as I come across them over the coming months.
Posted: November 4th, 2009 | Author: Alex | Filed under: Uncategorized
I invest some of my free time in helping out at the Seattle chapter of the IEEE Computer Society. We have been focusing on organizing high-quality talks mostly with an emphasis on software engineering or computer science. A number of other activities are in discussion for the coming months as well.
I am happy to report two things this morning:
- Our website was recently revamped. It may seem like a small step, but the move to WordPress did get us away from a very static and increasingly hard-to-maintain design. This was a very useful (and overdue) update. A number of additional features are in the works to build the site into a more effective resource for organizational activities, as time allows.
- Mark it in your calendar: Steve McConnell will be presenting Secrets of world-class software organizations at 6:30pm on Thursday, November 19. This will be at Bellevue College, Building N – Room 201. If you are involved with software development, you are probably familiar with Steve McConnell’s work and maybe even have some of his excellent books in the office somewhere. This should be a very interesting talk.
I’m looking forward to the presentation and hopefully many more such events in the future. Any feedback with respect to other speakers/topics of interest is always appreciated, too.
Posted: October 24th, 2009 | Author: Alex | Filed under: email
In How to Beat Information Overload, Nathan Zeldes, president of the Information Overload Research Group (IORG), discusses the causes and effects of information overload, in particular as it pertains to email, and reports on some solution approaches.
I think this is a problem that has only been getting worse over the past decade, and probably much more so over the last few years, which have brought an ever-increasing variety of online messaging services, social networks, and so forth.
In general, computer technology should save us time and make us more productive. If the opposite is happening, then there clearly are pain points in dire need of optimization.