upload static files retrieved from uilab web server
0 parents
commit fca66e3
Showing 14 changed files with 178 additions and 0 deletions.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
<!DOCTYPE html> | ||
<html lang="en"> | ||
<head> | ||
|
||
<meta charset="utf-8"> | ||
<meta http-equiv="X-UA-Compatible" content="IE=edge"> | ||
<meta name="viewport" content="width=device-width,initial-scale=1"> | ||
|
||
<title> Understanding Editing Behaviors in Multilingual Wikipedia </title> | ||
|
||
</head> | ||
<body> | ||
|
||
|
||
|
||
<div class="container"> | ||
<h2>Understanding Editing Behaviors in Multilingual Wikipedia</h2> | ||
</div> | ||
|
||
<div class="container"> | ||
<h4><a href='http://uilab.kaist.ac.kr/members/suinkim/'>Suin Kim</a>, <a href='#'>Sungjoon Park</a>, <a href='http://www.scotthale.net/blog/'>Scott A. Hale</a>, <a href='#'>Sooyoung Kim</a>, <a href='#'>Jeongmin Byun</a>, <a href="http://uilab.kaist.ac.kr/members/aliceoh/">Alice Oh</a></h4> | ||
<p><a href="http://arxiv.org/abs/1508.07266">arXiv</a></p> | ||
</div> | ||
|
||
<div class="container"> | ||
<h4>Abstract</h4> | ||
<p> | ||
Multilingualism is common offline, but we have a more limited understanding of the ways multilingualism is displayed online and the roles that multilinguals play | ||
in the spread of content between speakers of different languages. We take a computational approach to studying multilingualism using one of the largest user-generated | ||
content platforms, Wikipedia. We study multilingualism by collecting and analyzing a large dataset of the content written by multilingual editors of | ||
the English, German, and Spanish editions of Wikipedia. This dataset contains over two million paragraphs edited by over 15,000 multilingual users from July | ||
8 to August 9, 2013. We analyze these multilingual editors in terms of their engagement, interests, and language proficiency in their primary and non-primary | ||
(secondary) languages and find that the English edition of Wikipedia displays different dynamics from the Spanish and German editions. Users primarily | ||
editing the Spanish and German editions make more complex edits than users who edit these editions as a second language. In contrast, users editing the | ||
English edition as a second language make edits that are just as complex as the edits by users who primarily edit the English edition. In this way, English | ||
serves a special role bringing together content written by multilinguals from many language editions. Nonetheless, language remains a formidable hurdle to | ||
the spread of content: we find evidence for a complexity barrier whereby editors are less likely to edit complex content in a second language. In addition, we find | ||
that multilinguals are less engaged and show lower levels of language proficiency in their second languages. We also examine the topical interests of multilingual | ||
editors and find that there is no significant difference between primary and non-primary editors in each language. | ||
</p> | ||
</div> | ||
|
||
<div class="container"> | ||
<h4>Data</h4> | ||
<ul> | ||
<li>English : 1,678,585 Edited paragraphs <a href="en_edits.gz">Download</a> </li> | ||
<li>Spanish : 580,102 Edited paragraphs <a href="es_edits.gz">Download</a> </li> | ||
<li>German : 844,303 Edited paragraphs <a href="de_edits.gz">Download</a> </li> | ||
<li>Metadata <a href="metadata.gz">Download</a> </li> | ||
|
||
</ul> | ||
|
||
|
||
</div> | ||
|
||
|
||
|
||
</body> | ||
</html> |
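The Data section of this page links four gzip archives (`en_edits.gz`, `es_edits.gz`, `de_edits.gz`, `metadata.gz`). The commit does not document their internal layout, so the following is only a minimal sketch for inspecting one after download. It assumes the archive holds line-oriented UTF-8 text, which is an assumption and not something the page states. The helper names `iter_lines` and `preview` are hypothetical.

```python
import gzip
from itertools import islice
from typing import Iterator, List


def iter_lines(path: str, encoding: str = "utf-8") -> Iterator[str]:
    """Yield decoded lines from a gzip file one at a time (streaming, low memory)."""
    # ASSUMPTION: the archive is line-oriented text; undecodable bytes are replaced.
    with gzip.open(path, mode="rt", encoding=encoding, errors="replace") as handle:
        for line in handle:
            yield line.rstrip("\n")


def preview(path: str, limit: int = 5) -> List[str]:
    """Return the first `limit` lines so the file layout can be checked by eye."""
    return list(islice(iter_lines(path), limit))


# Example call (requires the file to be downloaded first):
#     for line in preview("en_edits.gz"):
#         print(line)
```

Streaming keeps memory use flat regardless of archive size, which matters here because the English file alone is listed with over 1.6 million edited paragraphs.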
Binary file not shown.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
<!DOCTYPE html> | ||
<html lang="en"> | ||
<head> | ||
<meta charset="utf-8"> | ||
<meta http-equiv="X-UA-Compatible" content="IE=edge"> | ||
<meta name="viewport" content="width=device-width,initial-scale=1"> | ||
<meta name="description" content="Joint Modeling of Topics, Citations, and Topical Authority in Academic Corpora (TACL 2017)"> | ||
<meta name="keywords" content="Topic modeling, authority, citation, scientific article, CORA, PNAS, Arxiv Physics, Citeseer"> | ||
<meta name="author" content="Jooyeon Kim, Dongwoo Kim, and Alice Oh"> | ||
<title> Joint Modeling of Topics, Citations, and Topical Authority in Academic Corpora </title> | ||
<style class="anchorjs"></style> | ||
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/css/bootstrap.min.css" integrity="sha384-1q8mTJOASx8j1Au+a5WDVnPi2lkFfwwEAa8hDDdjZlpLegxhjVME1fgjWPGmkzs7" crossorigin="anonymous"> | ||
<script src="../assets/js/ie-emulation-modes-warning.js"></script><!--[if lt IE 9]><script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script> | ||
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script><![endif]--> | ||
<link rel="apple-touch-icon" href="/apple-touch-icon.png"> | ||
<script>!function(e,t,a,n,c,o,s){e.GoogleAnalyticsObject=c,e[c]=e[c]||function(){(e[c].q=e[c].q||[]).push(arguments)},e[c].l=1*new Date,o=t.createElement(a),s=t.getElementsByTagName(a)[0],o.async=1,o.src=n,s.parentNode.insertBefore(o,s)}(window,document,"script","//www.google-analytics.com/analytics.js","ga"),ga("create","UA-146052-10","getbootstrap.com"),ga("send","pageview"); | ||
</script> | ||
<script id="_carbonads_projs" type="text/javascript" src="//srv.carbonads.net/ads/C6AILKT.json?segment=placement:getbootstrapcom&callback=_carbonads_go"></script><script type="text/javascript" src="//fallbacks.carbonads.com/home/e99a260b94849497ea962f674f0aebd9/?145128823"> | ||
</script> | ||
<style>.carbonad{display:block;background:#fdfdfd;background-image:-moz-linear-gradient(top,#f8f8f8,#fdfdfd);background-image:-webkit-gradient(linear,left top,left bottom,color-stop(0,#f8f8f8),color-stop(1,#fdfdfd));border:1px solid #d5d5d5;font-family:Lucida Grande,Arial,Helvetica,sans-serif;font-size:11px;height:118px;line-height:15px;overflow:hidden;width:300px}.carbonad-img{border:none;display:inline;float:left;height:100px;margin:9px;width:130px}.carbonad-text{display:inline;float:left;width:142px;padding-top:13px}.carbonad-text a{color:#000;text-decoration:none;text-transform:none;}.carbonad-tag{float:left;margin-top:9px;text-align:center;width:142px;color:#999}.carbonad-tag a{color:#999;text-decoration:none} | ||
</style> | ||
</head> | ||
<body> | ||
<div class="container"> | ||
<h2>Joint Modeling of Topics, Citations, and Topical Authority in Academic Corpora</h2> | ||
</div> | ||
<div class="container"> | ||
<h4><a href='#'>Jooyeon Kim</a>, <a href="http://arongdari.github.io//">Dongwoo Kim</a>, <a href="http://uilab.kaist.ac.kr/members/aliceoh/">Alice Oh</a></h4> | ||
<h4>TACL 2017</h4> | ||
|
||
|
||
<p><a href="./TACL2017.zip">Code &amp; Datasets</a></p> | ||
|
||
|
||
</div> | ||
<div class="container"> | ||
|
||
<h4>Abstract</h4> | ||
<p> | ||
|
||
Much of scientific progress stems from previously published findings, but searching through the vast sea of scientific publications is difficult. We often rely on metrics of scholarly authority to find the prominent authors, but these authority indices do not differentiate authority based on research topics. We present Latent Topical-Authority Indexing (LTAI) for jointly modeling the topics, citations, and topical authority in a corpus of academic papers. Compared to previous models, LTAI differs in two main aspects. First, it explicitly models the generative process of the citations, rather than treating the citations as given. Second, it models each author's influence on citations of a paper based on the topics of the cited papers, as well as the citing papers. | ||
We fit LTAI to four academic corpora: CORA, Arxiv Physics, PNAS, and Citeseer. We compare the performance of LTAI against various baselines, ranging from latent Dirichlet allocation to more advanced models, including the author-link topic model and the dynamic author citation topic model. The results show that LTAI achieves improved accuracy over other similar models when predicting words, citations, and authors of publications. | ||
|
||
</p> | ||
</div> | ||
<div class="container"> | ||
<p> | ||
Please contact Jooyeon Kim (jooyeon.kim_at_kaist.ac.kr) if you have any questions or comments. | ||
</p> | ||
</div> | ||
</body> | ||
</html> |
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
@inproceedings{Jo:2011:ASU:1935826.1935932, | ||
author = {Jo, Yohan and Oh, Alice H.}, | ||
title = {Aspect and sentiment unification model for online review analysis}, | ||
booktitle = {Proceedings of the fourth ACM international conference on Web search and data mining}, | ||
series = {WSDM '11}, | ||
year = {2011}, | ||
isbn = {978-1-4503-0493-1}, | ||
location = {Hong Kong, China}, | ||
pages = {815--824}, | ||
numpages = {10}, | ||
url = {http://doi.acm.org/10.1145/1935826.1935932}, | ||
doi = {10.1145/1935826.1935932}, | ||
acmid = {1935932}, | ||
publisher = {ACM}, | ||
address = {New York, NY, USA}, | ||
keywords = {aspect detection, sentiment analysis, topic modeling}, | ||
} |
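The `url` field in this entry stores a resolver link of the form `http://doi.acm.org/<DOI>`, while most BibTeX styles expect a bare DOI in the `doi` field. As a minimal sketch, a bare DOI can be recovered from such a link with a regular expression. The helper `bare_doi` and its pattern are illustrative assumptions, not part of this commit, and the pattern covers only the common `10.<registrant>/<suffix>` shape.

```python
import re

# Hypothetical helper, not part of the commit.
# ASSUMPTION: the DOI has the common shape "10.<4-9 digits>/<non-space suffix>".
_DOI_PATTERN = re.compile(r"(10\.\d{4,9}/\S+)")


def bare_doi(value: str) -> str:
    """Return the DOI embedded in `value`; raise ValueError if none is found."""
    match = _DOI_PATTERN.search(value.strip())
    if match is None:
        raise ValueError(f"no DOI found in {value!r}")
    return match.group(1)


print(bare_doi("http://doi.acm.org/10.1145/1935826.1935932"))
# prints: 10.1145/1935826.1935932
```

A value that is already a bare DOI passes through unchanged, so the helper can be applied to every entry in a bibliography without checking each one first.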
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,50 @@ | ||
<html> | ||
<head> | ||
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> | ||
<title>Users &amp; Information Lab</title> | ||
<link rel="shortcut icon" href="/images/uilab_fav.png" /> | ||
<script type="text/javascript"> | ||
var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www."); | ||
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E")); | ||
</script> | ||
<script type="text/javascript"> | ||
try { | ||
var pageTracker = _gat._getTracker("UA-2589526-2"); | ||
pageTracker._trackPageview(); | ||
} catch(err) {} | ||
</script> | ||
<style> | ||
div { margin: 0 0 50px 0; padding: 0; } | ||
p { margin: 0; padding: 0; } | ||
h2 { margin-bottom: 10px; } | ||
</style> | ||
</head> | ||
<body> | ||
<div> | ||
<h1>Aspect and Sentiment Unification Model for Online Review Analysis</h1> | ||
<p>Yohan Jo (yohan.jo@kaist.ac.kr)</p> | ||
<p>Alice Oh (alice.oh@kaist.edu)</p> | ||
<p>Department of Computer Science, KAIST, Daejeon, Korea</p> | ||
<br /> | ||
<p><a href="http://portal.acm.org/citation.cfm?id=1935932&CFID=8905137&CFTOKEN=-32037455">[URL]</a> <a href="citation.bib">[bibTeX]</a></p> | ||
</div> | ||
<div> | ||
<h2>Data</h2> | ||
<ul> | ||
<li>Electronic device reviews (Amazon) <a href="Amazon.zip">[Link]</a></li> | ||
<li>Restaurant reviews (Yelp) <a href="Yelp.zip">[Link]</a></li> | ||
</ul> | ||
<p>NOTE: Some reviews may appear multiple times</p> | ||
</div> | ||
<div> | ||
<h2>Source Code</h2> | ||
<p><a href="ASUM.zip">Java Code</a></p> | ||
</div> | ||
<div> | ||
<p> If you have questions, please send an email to <b>yohan.jo@kaist.ac.kr</b></p> | ||
</div> | ||
</body> | ||
</html> |
Binary file not shown.