About
We are hiring - if you're looking for a job…
Activity
-
I am proud that CuspAI closed a $100M Series A fundraise this week. Chad and I started this journey only a year ago to change the world with an AI…
Liked by Alex Smola
-
The new semester is here at CMU, excited to co-teach with Tim Dettmers, to offer our fun course on "Build Your Mini-PyTorch (needle) from scratch…
Liked by Alex Smola
Experience & Education
Publications
-
Discovering Geographical Topics in the Twitter Stream
Proceedings of the 21st International Conference on World Wide Web (WWW 2012)
Micro-blogging services have become indispensable communication tools for online users for disseminating breaking news, eyewitness accounts, individual expression, and protest groups. Recently, Twitter, along with other online social networking services such as Foursquare, Gowalla, Facebook and Yelp, have started supporting location services in their messages, either explicitly, by letting users choose their places, or implicitly, by enabling geo-tagging, which associates messages with latitudes and longitudes. This functionality allows researchers to address an exciting set of questions: 1) How is information created and shared across geographical locations, 2) How do spatial and linguistic characteristics of people vary across regions, and 3) How to model human mobility. Although many attempts have been made to tackle these problems, previous methods are either too complicated to implement or too oversimplified to yield reasonable performance. It is a challenging task to discover topics and identify users' interests from these geo-tagged messages due to the sheer amount of data and the diversity of language variations used on these location sharing services. In this paper we focus on Twitter and present an algorithm that models diversity in tweets based on topical diversity, geographical diversity, and an interest distribution of the user. Furthermore, we take the Markovian nature of a user's location into account. Our model exploits sparse factorial coding of the attributes, thus allowing us to deal with a large and diverse set of covariates efficiently. Our approach is vital for applications such as user profiling, content recommendation and topic tracking. We show high accuracy in location estimation based on our model. Moreover, the algorithm identifies interesting topics based on location and language.
Patents
-
Data Mining Unlearnable Data Sets
Issued US 20080027886
This invention concerns data mining, that is, the extraction of information from "unlearnable" data sets. In particular, it concerns an apparatus and a method for this purpose. The invention involves creating a finite training sample from the data set (14); training (50) a learning device (32) using a supervised learning algorithm to predict labels for each item of the training sample; processing other data from the data set with the trained learning device to predict labels and determining whether the predicted labels are better (learnable) or worse (anti-learnable) than random guessing (52); and using a reverser (34) to apply negative weighting to the predicted labels if they are worse (anti-learnable) (54).
-
Term Weighting for Contextual Advertising
Filed US 20110093331
A contextual advertising system selects online advertisements for display on a network location. The system may transform page content of a page received in a platform over a network into a textual representation. In addition, the system may transform received site content of a site into a site signature. The site includes the page. The system may then correct the textual representation using the site signature to produce a modified textual representation. The system may use the modified textual representation to select an online advertisement. Considering a page in the context of the entire website to which it belongs leads to better understanding and interpretation of the page topic(s) and thus yields more accurate ad matching.
Languages
-
English
Native or bilingual proficiency
-
German
Native or bilingual proficiency
-
Italian
Professional working proficiency
-
Spanish
Limited working proficiency
-
French
Limited working proficiency
More activity by Alex
-
Check out Higgs Audio v2: https://lnkd.in/g8-B4_ag Kudos to the modeling team members involved in this release: Martin Ma, Dongming Shen, Ruskin Raj…
Liked by Alex Smola
-
On behalf of Fellows Fund's 30+ fellows and 10k+ AI community members, we're thrilled to welcome Zachary Lipton as our newest Distinguished Fellow at…
Liked by Alex Smola
-
With deep appreciation for an amazing three decades at Microsoft, rich with challenges, learning, and lifelong friendships, I’m starting an exciting…
Liked by Alex Smola
-
Boson AI's new audio generation model is out with some killer features: emotionally expressive voice, long form audio, multi-speaker voice cloning…
Shared by Alex Smola
-
TRI's "LBM 1.0" paper appeared on arxiv last night! Large Behavior Models (LBMs) are foundation models for robots that map robot sensors (notably…
Liked by Alex Smola
-
Excited to see that GigaPath was used in the first real-world, real-time clinical deployment of digital pathology foundation models for precision…
Liked by Alex Smola
-
Very excited to have Lindsey Allen on board as Boson's first dedicated Chief Product Officer. She's bringing along decades of experience with…
Shared by Alex Smola
-
After nearly eight years at NVIDIA, I’m excited to share that I’ve started a new chapter. I still vividly remember meeting Jensen Huang at CVPR 2017…
Liked by Alex Smola