Twitter Analysis DB Details

= The Database =
 
To understand what the application does, you should have some understanding of the data that it draws on: the database.  That is what I will describe in this section.  If you wish, skip over the more technical info and go right to the table descriptions.
 
  
 
The application is designed so that using it does not require writing any SQL ( Structured Query Language ); however, some familiarity with SQL and relational databases may be useful.  This is not an introduction to SQL, but a description of some of the ways it has been applied in this application.
 
=== Tweets ===
 
This is data drawn directly from Twitter and restructured to be loaded into a database table.  Select against this table if you want to see the tweet text.  This is also the only table with date information.
 
  
 
* tweets  -- the actual tweet and supporting columns:
** tweet_id  -- identifies the tweet
** tweet_datetime -- when tweeted, has both a date and a time.  I have extracted an extra column from it:
** time_of_day -- the time of day of the tweet.  For both date and time I am uncertain how these relate to the timezone of the tweet record and of the tweeter.
** tweet -- text of the tweet
** who -- who tweeted.  So far there is no real support for multiple tweeters, but it would not be too hard to add to the rest of the application.  Is populated in the database load.
** tweet_type -- a tweet by the author or a retweet
** is_covid  -- an indicator that suggests whether the tweet is covid related or not.  Not in source data; generated by the table load routine.  For now, see the code.
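
As a rough illustration of the columns listed above, the following sketch builds a small in-memory table and selects against it. Everything beyond the column names is an assumption made for the example: the use of Python's sqlite3 module, the column types, the date format, the 0/1 encoding of is_covid, the tweet_type values, and the sample rows are all invented here, and the application's real schema and load routine may differ.

```python
import sqlite3
from datetime import datetime


def build_demo_tweets_db():
    """Create an in-memory database with a tweets table shaped like the list above."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE tweets (
            tweet_id       TEXT PRIMARY KEY,  -- assumed type
            tweet_datetime TEXT,              -- assumed 'YYYY-MM-DD HH:MM:SS' text
            time_of_day    TEXT,              -- derived from tweet_datetime
            tweet          TEXT,
            who            TEXT,
            tweet_type     TEXT,              -- assumed values: 'tweet' / 'retweet'
            is_covid       INTEGER            -- assumed 0/1 flag
        )
        """
    )
    # Invented sample rows, not real data.
    samples = [
        ("1", "2020-03-15 08:30:00", "example text about covid", "someone", "tweet", 1),
        ("2", "2020-03-16 21:05:00", "example text about lunch", "someone", "retweet", 0),
    ]
    for tweet_id, when, text, who, tweet_type, is_covid in samples:
        # Extract the extra time_of_day column from the full timestamp.
        time_of_day = datetime.strptime(when, "%Y-%m-%d %H:%M:%S").strftime("%H:%M:%S")
        conn.execute(
            "INSERT INTO tweets VALUES (?, ?, ?, ?, ?, ?, ?)",
            (tweet_id, when, time_of_day, text, who, tweet_type, is_covid),
        )
    conn.commit()
    return conn


def covid_tweets_on_or_after(conn, start):
    """Return (tweet_datetime, tweet) rows flagged as covid related on or after start."""
    cursor = conn.execute(
        "SELECT tweet_datetime, tweet FROM tweets "
        "WHERE is_covid = 1 AND tweet_datetime >= ? "
        "ORDER BY tweet_datetime",
        (start,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    demo_conn = build_demo_tweets_db()
    for row in covid_tweets_on_or_after(demo_conn, "2020-03-01 00:00:00"):
        print(row)
```

Storing the timestamp as sortable text keeps the date comparison a plain string comparison; this is a convenience of the sketch, not a statement about how the application stores dates.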
 
=== Words ===
 
This is a table of words taken from common usage and loaded into the database table "words".  It may help you see how the vocabulary used in the tweets matches or differs from more general use of the language.
 
** word -- the word ( again all lower case and normalized )
** word_count -- count of the uses in the analyzed body of text.
** word_rank -- rank starting at 1 for the word with the highest word_count and ascending as word_count decreases.
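
The relationship between word_count and word_rank can be sketched as follows. This is only an illustration under assumptions: the function name, the sample counts, and the alphabetical tie-breaking rule are invented here, and the application's load routine may assign ranks differently.

```python
def rank_words(word_counts):
    """Return (word, word_count, word_rank) rows, with rank 1 for the highest count."""
    # Sort by descending count; break ties alphabetically (an assumption of this sketch).
    ordered = sorted(word_counts.items(), key=lambda item: (-item[1], item[0]))
    return [(word, count, rank) for rank, (word, count) in enumerate(ordered, start=1)]


if __name__ == "__main__":
    # Invented counts, not taken from the real words table.
    sample = {"of": 300, "covid": 12, "the": 500}
    for row in rank_words(sample):
        print(row)
```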
  
  
