Seminar: Identifying Semantic Similarity for Plagiarism Detection

Course type: Seminar
Course number: INF-20640-20152
Frequency: every semester
Semester: WS 2015/16
Expected participants: 7
Max. participants: 10
SWS (weekly semester hours): 2
Language: English
Credits: 4
Hyperlink: http://www.isg.uni-konstanz.de/teaching/
Dates
Day: Tue.
Time: 11:45 to 13:15 s.t.
Frequency: weekly
Duration: 20.10.2015 to 13.02.2016
Room: PZ - PZ 901
Max. participants: 10

Assigned lecturers
No public person listed
Assignment to institutions
Department of Computer and Information Science
JP Gipp (Information Science)

Plagiarism encompasses the use of ideas, concepts, words, or structures without appropriately acknowledging the source to benefit in a setting where originality is expected. Academic plagiarism is a serious problem that harms society and the scientific process, because it distorts the mechanisms for tracing and correcting results. In the worst case, academic plagiarism can jeopardize lives, e.g., if medical or pharmaceutical studies are plagiarized and wrong findings affect later research or practical applications.

Plagiarism detection is an information retrieval task supported by specialized information retrieval systems, called plagiarism detection systems. Currently available plagiarism detection systems exclusively perform literal text string comparisons. These systems reliably identify copies but often fail to detect disguised plagiarism, such as paraphrases, translations, or idea plagiarism. As a result, a large fraction of today’s disguised plagiarism goes undetected.
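
The limitation of literal comparison can be illustrated with a minimal sketch. It assumes a simple word 3-gram overlap measure (Jaccard similarity), which is an illustrative stand-in and not the method of any particular detection system. The function names and the example sentences are hypothetical.

```python
import re


def word_ngrams(text, n=3):
    """Return the set of word n-grams (as tuples) of the lowercased text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard_overlap(a, b, n=3):
    """Jaccard similarity of the word n-gram sets of two texts (0.0 to 1.0)."""
    grams_a, grams_b = word_ngrams(a, n), word_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


# Hypothetical example sentences, chosen only for illustration.
source = "Academic plagiarism distorts the mechanisms for tracing and correcting results."
copy = "Academic plagiarism distorts the mechanisms for tracing and correcting results."
paraphrase = "Copying scholarly work undermines how findings are traced and later corrected."

print(jaccard_overlap(source, copy))        # 1.0: a verbatim copy shares every 3-gram
print(jaccard_overlap(source, paraphrase))  # 0.0: the paraphrase shares no 3-gram
```

The paraphrase expresses a similar idea but shares no word sequence with the source, so a purely literal measure of this kind assigns it no similarity at all.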

The challenge of identifying semantic similarity of documents to detect disguised forms of plagiarism has attracted intense research that spans several fields, including information retrieval, computational linguistics, big data management and analysis, and information visualization.


Introductory Literature:

  • N. Meuschke and B. Gipp. State of the Art in Detecting Academic Plagiarism. International Journal for Educational Integrity, 9 (1): 50–71, June 2013. (pdf: http://www.ojs.unisa.edu.au/index.php/IJEI/article/view/847)
  • S. M. Alzahrani, N. Salim, and A. Abraham. Understanding Plagiarism Linguistic Patterns, Textual Features, and Detection Methods. In IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, volume 42, pages 133–149, Mar. 2012. doi:10.1109/TSMCC.2011.2134847. (pdf: http://isda01.softcomputing.net/smcc2011.pdf)
  • C. D. Manning, P. Raghavan, and H. Schütze. An Introduction to Information Retrieval. Cambridge University Press, Cambridge, England, 2009. (free online edition: http://www-nlp.stanford.edu/IR-book/)

Topic-specific literature will be distributed during the initial meeting.


The seminar can serve as a starting point for a topically related bachelor’s or master’s project and thesis. For current project and thesis proposals, visit: http://www.isg.uni-konstanz.de/students-corner/




1. presentation (30 min)

2. term paper (8-10 pages per person, ACM style)

Group work is possible.


Seminar participants will explore current research approaches that tackle the challenges of identifying semantic similarity and detecting disguised plagiarism.

Participants will either pick a research topic from a provided pool of suggestions or choose a topic according to their own interests. For their topic, participants will give an overview of the relevant research in a presentation during the seminar (30 min) and in a term paper (8-10 pages per person, ACM style) due at the end of the seminar. Topic-specific literature suggestions will be provided during the initial meeting. Participants are expected to carry out additional literature research and to work independently.


Seminar participants will gain an overview of the state-of-the-art technologies for plagiarism detection and their individual strengths and weaknesses. They will be able to describe the current research trends and challenges in plagiarism detection, as well as the predominant approaches for tackling these research challenges.

Each participant will perform an in-depth literature review of one current approach for identifying semantic similarity between documents and of how this approach can be applied to identifying plagiarism. The participants will present their findings in an academic paper and a 30-minute presentation during one of the seminar sessions. Through this process, which the lecturers supervise and guide, the participants will train their ability to:

  • find, organize, and systematically read relevant research papers
  • analyze, compare, and contrast research approaches and findings 
  • structure, write, and format an academic paper
  • present their work using appropriate presentation techniques and presentation aids
  • answer questions and discuss their work with peers

By successfully completing the seminar, participants will acquire the knowledge and methodological skills needed to complete a bachelor’s or master’s project related to identifying semantic document similarity (e.g., for plagiarism detection), as well as for many other information retrieval tasks.


Total workload: 4 ECTS = 120 hours

Not classified in the course catalog. This course is from semester WS 2015/16. Current semester: WS 2017/18.