• "Hello World"

    The preliminary results using the standard and modified NLTK Bayesian classifiers were almost suspiciously high. As a way to double-check the results, I used an NLTK feature that shows you the features that most […]

  • "Hello World"

    I presented an overview of my results in class, but wanted to take a minute to discuss my preliminary results here. My most meaningful set so far has been an attempt to classify texts by author using the complete works of […]

  • "Hello World"

    Python has been very easy to work with, but larger datasets are starting to cause problems. Some of the issues may be related to the IDLE shell that I am working with, although some research has shown many other […]

  • "Hello World"

    One of NLTK’s better features is the standard tokenizer. It integrates particularly well with the default dictionaries and corpora NLTK makes available. It only takes a single line to generate a list of stop […]

  • "Hello World"

    One of the most irritating issues I’ve had with Python has been the stability of the default IDLE interface. I understand that most languages can be used to crash a machine, but it is far too easy in Python […]

  • "Hello World"

    NLTK offers a number of different models or object types to represent the data you’re working with. One of the most common types in NLTK is a corpus, which represents a body of texts to analyze. Although it […]

  • "Hello World"

    While I was reading through the code of NLTK’s Bayesian classifier, I noticed something a little unusual about it. Naive Bayes can be based on two different underlying models of the features. The first is a […]

  • "Hello World"

    I was originally working with WingIDE, but have almost completely switched over to IDLE. It is not as full-featured as something like Wing, but there are a few major pluses. First, it is part of Python, nothing […]

  • "Hello World"

    NLTK has a built-in corpus data type, as well as access to a number of common corpora (like the Brown Corpus), many of which are already tagged. This makes it easy to jump in and practice using things like the […]

  • "Hello World"

    The first model that I am working with is based on naive Bayes, with the frequency of words as the feature being measured. Although this is a standard baseline to work with, I am looking for some more sophisticated […]

  • "Hello World"

    For my real project I will need the underlying data from Moretti’s project. Without it, not only do I lack classifications, I don’t even know which books are being analyzed. So until then I chose to work […]

  • "Hello World"

    Python has a number of specialized IDEs which I was interested in trying. Since Python emphasizes ease of development, I was hoping someone had created an easy-to-use IDE. There are many good languages out there […]

  • "Hello World"

    Python currently exists in two main variations: Python 2.x and Python 3.x. Python 3 is now over 4 years old, but is not backwards compatible with Python 2. This means all programs and libraries written in […]

  • "Hello World"

    I’d like to use this space to record my experiences not only with the Natural Language Toolkit, but also with Python itself. Although I have extensive programming experience, I have never used Python, and it is […]

  • "Hello World"

    The project that I am working on is a text classification project. I am currently working with Python and the Natural Language Toolkit to build some basic text classifiers. I am trying to repeat Moretti’s […]

  • DH 2013 Tools and Methods

    Daniel Terry commented on the post, Welcome to DH methods 2013!

    Hello everyone. I am a part-time student in the MALS program in the Digital Humanities track. I am also a full-time systems engineer at a financial company, so I am coming from a fairly technical background. I […]

  • MALS 78100 – The Digital Humanities in Research and Teaching

    Daniel Terry commented on the post, Graphs Maps and Trees

    This reminded me of an article I read recently about children’s books and maps (and how the best books had maps). I remember maps being much more common in children’s books than in “adult” books. Maybe we were just […]
