Help Page Index


BILL INFORMATION

DETAILED TIPS ABOUT SEARCHING
  • Stop words
  • Natural Language Search
  • Literal Strings
  • Boolean Operators
  • Fielded Search
  • Date and Numeric Ranges
  • Right Truncation (Wildcards)
  • Grouping Search Terms
  • Relevance Ranking
  • Proximity Relationships
  • Special Characters
  • Limited Hits

SUBSCRIPTION SERVICES (e-mail)

    BILL INFORMATION

    Bill information is available for the current, prior and past sessions. The default is Current Session. If you wish to display bill information for a prior or past session, select that session from the drop down box.

    If you know the bill number, click on the house designation (Assembly or Senate), type the measure number in the box provided, and click on Search. A list of all measures with that bill number for the house selected will be displayed. These measures may include different types of bills introduced in both the regular and the extraordinary sessions.

    If you do not know the house of origin for the bill, entering the bill number and selecting both will return a list of all measures with that bill number from both houses.

    If you do not know the bill number, you may search for bills by keyword(s), author(s), or both keyword(s) and author(s).

    For Keyword(s) Searching: type the word(s) in the space provided. The system will return a list of all the bills that contain all the entered keyword(s).

    For Author(s) Searching: type the last name of the Legislator in the space provided. If there is more than one Assembly Member or Senator with the same last name, type both the first and last name. The system will return a list of all the bills authored by the Legislator.

    For both Author(s) and Keyword(s) Searching: type the keyword(s) and author in the space provided. The system will return a list of all the bills authored by the Legislator that contain all the entered keyword(s).


    DETAILED TIPS ABOUT SEARCHING

    Special Search Note

    Words so common that they occur in almost every document are called stop words. These stop words cannot be used for searching a document because they occur so frequently that they are not useful for distinguishing one document from another. Stop words are ignored during a search. If they are inadvertently used for a search, the results may be undesirable.

    Natural Language Search

    The server can be queried using natural language questions. The server does not understand the question; rather, it takes the words and phrases in the question and finds documents that contain them. "Tell me about portable computers." is an example of a natural language question. In this example, the WAIS server would search for documents containing the words 'portable' and 'computers'. The other words, 'tell', 'me', and 'about', are stop words: words so common that they occur in almost every document, and so they are not used for searching.
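
    As a rough illustration of the behavior described above, the following Python sketch drops stop words from a question and keeps the remaining search terms. The stop-word list and the function name search_terms are assumptions made for this example; the server's actual list and internals are not documented here.

```python
import re

# Hypothetical stop-word list for illustration; the server's actual list may differ.
STOP_WORDS = {"tell", "me", "about", "the", "a", "an", "of"}

def search_terms(question):
    """Return the words of a question that remain after stop words are dropped."""
    words = re.findall(r"[a-z0-9]+", question.lower())
    return [word for word in words if word not in STOP_WORDS]

print(search_terms("Tell me about portable computers."))
# prints ['portable', 'computers']
```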

    Literal Strings

    A similar but more specific kind of query finds documents that contain one or more exact phrases, which you enclose in double quotation marks. This is known as a literal. For example, the query

      "search engine capabilities"

    returns only documents that contain this exact phrase. The WAIS search engine performs a literal search exactly as if you had used the boolean operator ADJ. Thus the above example would yield the same results as

      search ADJ engine ADJ capabilities

    For this reason, it is best to stick to noun phrases when using literals; if your literal phrase includes stop words, the stop words will be ignored.
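
    The equivalence between a literal and a chain of ADJ operators can be sketched in a few lines of Python. The helper name literal_to_adj is hypothetical and only rewrites the query text; it does not contact any server.

```python
def literal_to_adj(literal):
    """Rewrite a double-quoted literal as the equivalent chain of ADJ operators."""
    words = literal.strip('"').split()
    return " ADJ ".join(words)

print(literal_to_adj('"search engine capabilities"'))
# prints search ADJ engine ADJ capabilities
```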

    Boolean Operators

    The boolean operators AND, OR, NOT, and ADJ help establish logical relationships between concepts expressed in natural language. These operators are especially useful for narrowing down a search.

    AND, &&

    The AND operator is helpful in restricting a search when a particular pair or larger group of terms is known. For instance, when searching for documents on the weather in Boston, a question such as "weather AND Boston" would return only those documents that contain both the word "weather" and the word "Boston". You can use more than one AND in a query, e.g. "weather AND Boston AND November". Note that the C-like double ampersand (&&) may be used instead of spelling out the word AND.

    OR, ||

    The OR operator is often used to join two different phrases of a Boolean search. A question such as "hurricane OR tornado" would search for all documents containing either the word "hurricane", or the word "tornado", or both. You can also use more than one OR in a query. A natural language question is much like having an implicit OR between the words, except that the search engine does more work in a natural language query to determine the relevance of words and their relationships in a phrase. Note that the C-like double vertical bars (||) may be used instead of spelling out the word OR.

    NOT

    NOT is a binary operator. That is, it has to come between two words or parenthesized clauses. NOT is used to reject any documents that contain certain words. The question "basketball NOT college" would find all documents that contain the word "basketball" but do not also contain the word "college". Note, however, that this question would also eliminate articles on professional players that mention their alma maters. In other words, be careful not to limit your search too much with the NOT operator; make sure that you know what you're throwing away.

    Don't be afraid to use NOT! One good search strategy is to search for a broadly occurring term and get lots of documents you don't want, and then to use NOT to filter out the bad documents. For example, if you're trying to cook okra, you might search for "cooking AND okra" and find nothing; but if you search for "cooking", you find lots of articles on cooking meats and pastas. You then can search for "cooking NOT meat NOT pasta", and you might find more interesting articles that eventually lead you to your goal. Another handy trick is to use NOT to "break the 40 barrier". Typical WAIS clients only display 40 documents, but if you use NOT wisely, you can flush out the documents you don't like in those 40 and progressively refine your search, adding better and better documents to the 40 that you see.
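
    One way to picture AND, OR, and NOT is as intersection, union, and difference on sets of matching documents. The Python sketch below uses a small made-up collection (DOCS) and a hypothetical helper (containing); it illustrates the logic only and is not how the server is implemented.

```python
# Hypothetical mini-collection: document id -> set of words in that document.
DOCS = {
    1: {"weather", "boston", "november"},
    2: {"weather", "chicago"},
    3: {"hurricane", "boston"},
    4: {"tornado", "kansas"},
}

def containing(word):
    """Return the ids of all documents that contain the given word."""
    return {doc_id for doc_id, words in DOCS.items() if word in words}

# "weather AND boston": documents containing both words.
weather_and_boston = containing("weather") & containing("boston")
# "hurricane OR tornado": documents containing either word, or both.
hurricane_or_tornado = containing("hurricane") | containing("tornado")
# "weather NOT boston": documents containing "weather" but not "boston".
weather_not_boston = containing("weather") - containing("boston")

print(weather_and_boston, hurricane_or_tornado, weather_not_boston)
# prints {1} {3, 4} {2}
```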

    ADJ

    The adjacent operator, ADJ, is used to ensure that one word is followed by another in the returned document, with no other words in between. For example, "cordless ADJ telephone" returns only documents containing "cordless telephone" and ignores documents that contain only one of the words or that contain both but not adjacent to one another. ADJ will nonetheless work when stop words interrupt two words; for example, the preceding query will also find occurrences of "cordless for telephone". Note that the ADJ operator yields the same results as a literal query. Also note that ADJ, unlike AND and OR, is not commutative: "telephone ADJ cordless" does not work the same as "cordless ADJ telephone".
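
    The ADJ rules above (order matters, and stop words between the two words are ignored) can be sketched in Python as follows. The stop-word list and the function adj_match are assumptions made for illustration only.

```python
import re

# Hypothetical stop-word list; the server's actual list may differ.
STOP_WORDS = {"for", "the", "a", "of"}

def adj_match(first, second, text):
    """Return True if `first` is immediately followed by `second`, ignoring stop words."""
    words = [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS]
    return any(a == first and b == second for a, b in zip(words, words[1:]))

print(adj_match("cordless", "telephone", "A cordless telephone"))      # prints True
print(adj_match("cordless", "telephone", "cordless for telephone"))    # prints True
print(adj_match("cordless", "telephone", "telephone cordless"))        # prints False
```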

    Mixing Natural Language, Literals, And Booleans

    The ability to mix natural language, literals, and boolean operators is unique to the WAISserver search engine. Combining natural language and boolean operators enables end users to better target their searches. For example, suppose you were looking for documents specifically on portable laptop computers that are not made by Tosuji Corporation. The question could then be "Tell me about portable laptop computers NOT Tosuji."

    Fielded Search

    For data sets whose documents have special data fields, selected portions of the documents can be tagged by the WAIS parser as fields. A client can then ask a WAIS server to limit its search to those documents containing a user-specified value of a particular field. This is called a fielded search.

    The mail-or-email parse format is an example of a parse format in which fields are tagged. For this parse format, the WAIS parser detects the "to" and "cc" fields, the "from" and "sender" fields, the "subject" field, and the "date" field. An example of a question using natural language, a boolean operator, and fielded search is: "company picnic AND from=barbara". The WAIS server would then find email messages about a company picnic that Barbara sent.

    Date and Numeric Ranges

    For a date or numeric field, a range may be specified using the syntax

      field-name    comparison-operator    value

    where comparison-operator may be one of > (greater than), < (less than), >= (greater than or equal to), <= (less than or equal to), or = (equal to).

    Currently, dates with the following formats are supported:

                  m-d-yy    m-d-yyyy    mm-dd-yy 
                  m/d/yy    mm/dd/yy    m.d.yy 
                  today     yesterday

    Only positive integers are supported for numeric fields. If the comparison operator is =, then the range may be specified using the word TO, as in

               date = 4/15/93 TO 4/14/94

    Both ends of the range are inclusively specified.
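
    To show what an inclusive range means in practice, here is a small Python sketch that parses the range above and tests a few dates against it. The helper names are hypothetical, only the m/d/yy format is handled, and the reading of two-digit years (93 as 1993) follows Python's own convention, which is assumed rather than confirmed to match the server's.

```python
from datetime import date, datetime

def to_date(text):
    """Parse a date written as m/d/yy (one of the supported formats)."""
    return datetime.strptime(text.strip(), "%m/%d/%y").date()

def parse_range(expr):
    """Split 'start TO end' into a pair of dates."""
    low, high = expr.split("TO")
    return to_date(low), to_date(high)

def in_range(day, low, high):
    """Both ends of the range are inclusive."""
    return low <= day <= high

low, high = parse_range("4/15/93 TO 4/14/94")
print(in_range(date(1993, 4, 15), low, high))  # prints True (first day is included)
print(in_range(date(1994, 4, 14), low, high))  # prints True (last day is included)
print(in_range(date(1994, 4, 15), low, high))  # prints False
```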

    Right Truncation (Wildcards)

    A user can specify right truncation by ending a word with the asterisk (*) wild card character. This tells the search engine to search on words whose first several characters match the base characters before the *. For example, you might use right truncation in a question such as geo*, which may retrieve documents containing the words: geographer, geography, geologist, geometry, geometrical, etc.
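
    Right truncation amounts to a prefix match. The Python sketch below applies it to a made-up word list; the function name and the vocabulary are assumptions for this example.

```python
def truncation_matches(term, vocabulary):
    """Return the words matched by a term, treating a trailing * as right truncation."""
    if term.endswith("*"):
        base = term[:-1]
        return [word for word in vocabulary if word.startswith(base)]
    return [word for word in vocabulary if word == term]

# Hypothetical vocabulary of indexed words.
VOCABULARY = ["geographer", "geography", "geologist", "geometry", "genome", "ego"]

print(truncation_matches("geo*", VOCABULARY))
# prints ['geographer', 'geography', 'geologist', 'geometry']
```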

    Grouping Search Terms

    A user can group search terms and phrases together using parentheses.

    For example, if you wish to search for information about snowstorms, tornadoes, or hurricanes in New York City, you might search for "(snowstorms OR tornadoes OR hurricanes) AND (New ADJ York ADJ City)". You can also nest your parentheses; for example, "from = ( (ben ADJ wais) OR (brewster ADJ think) )" searches for messages from either ben@wais.com or brewster@think.com. When you're using several boolean operators, you should always group terms to make clear how the operators are to be applied.
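
    The following Python sketch shows why grouping matters: the same three terms give different answers depending on where the parentheses go. The sample document and the two function names are made up for illustration.

```python
def storms_and_city(words):
    """(snowstorms OR tornadoes) AND york"""
    return ("snowstorms" in words or "tornadoes" in words) and "york" in words

def storms_or_pair(words):
    """snowstorms OR (tornadoes AND york)"""
    return "snowstorms" in words or ("tornadoes" in words and "york" in words)

# Hypothetical document that mentions snowstorms but not New York.
document = {"snowstorms", "chicago"}

print(storms_and_city(document))  # prints False
print(storms_or_pair(document))   # prints True
```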

    Relevance Ranking

    When documents are returned after a search has completed, the order in which they appear on the screen is based on what is known as relevance ranking. Each document is scored based on its relevance to the user's question, where the most relevant document has the highest score, or rank (1000 being the highest, 1 being the lowest). Documents are listed in descending order of score, with the highest-scoring document first and the lowest-scoring document last.

    A document receives a higher score if the words in the question are in the headline, if the words appear many times, or if phrases occur as they do in the question. A document's score is derived using techniques such as word weighting, term weighting, proximity relationships, and word density. These scoring techniques are outlined below.

    Proximity Relationships

    Proximity relationship scoring specifies that if the words in a natural language question are located close together in a document, they are given a higher weight than those found further apart. The idea behind a proximity relationship is that if a document contains a phrase similar to one in the user's question, that document is more likely to be relevant.
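
    The idea can be illustrated with a small Python sketch that measures how far apart two query words are in a document. The actual scoring formula used by the server is not described here, so the function min_gap is only a hypothetical stand-in: a smaller gap would correspond to a higher proximity weight.

```python
def min_gap(text, first, second):
    """Return the smallest distance, in words, between two terms (None if either is absent)."""
    words = text.lower().split()
    first_positions = [i for i, word in enumerate(words) if word == first]
    second_positions = [i for i, word in enumerate(words) if word == second]
    if not first_positions or not second_positions:
        return None
    return min(abs(i - j) for i in first_positions for j in second_positions)

print(min_gap("portable computers are popular", "portable", "computers"))        # prints 1
print(min_gap("portable radios and desktop computers", "portable", "computers"))  # prints 4
```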

    Special Characters

    The WAIS server was originally designed to be as general as possible and, in this spirit, it ignores all characters in a document that are neither letters nor numbers. In fact, non-alphanumeric characters usually separate words for the parser; for example, "F.Y.I." parses out to three words. This rule also applies to queries used to search a directory of servers.
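
    A simple way to picture this rule is to split text on every character that is not a letter or a number. The Python sketch below does that with a regular expression; it is an approximation for illustration, and the function name split_words is an assumption.

```python
import re

def split_words(text):
    """Split text into words, treating every non-alphanumeric character as a separator."""
    return re.findall(r"[A-Za-z0-9]+", text)

print(split_words("F.Y.I."))
# prints ['F', 'Y', 'I']
```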

    Limited Hits

    The number of hits is the number of times the bill is found in the database. For every version of a bill (introduced, amended, etc.), an occurrence of the bill is stored in the database and is considered one hit. Only the most current version of a bill is kept for display. The database also uses one additional hit for housekeeping purposes for each search. Therefore the number of hits and the number of documents displayed do not correspond one to one.

    For example, suppose we search for the word "DOG". The search finds "DOG" in two bills, Bill_1 and Bill_2.
    Bill_1 has four versions: (1) introduced, (2) amended, and (1) enrolled. Bill_2 has two versions: (1) introduced and (1) amended.
    Each version is considered a hit, and one more hit is added for housekeeping. In this example there are seven hits.

        Bill_1   four (versions)
        Bill_2   two (versions)
        plus one (housekeeping)
        For a total of seven hits.
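
    The same arithmetic can be written out as a short Python sketch. The variable names are made up for this example; the figures are the ones given above.

```python
# Number of stored versions for each bill in the example above.
versions_per_bill = {"Bill_1": 4, "Bill_2": 2}
HOUSEKEEPING_HITS = 1  # one extra hit is used per search

total_hits = sum(versions_per_bill.values()) + HOUSEKEEPING_HITS
documents_displayed = len(versions_per_bill)  # only the most current version of each bill

print(total_hits, documents_displayed)
# prints 7 2
```

    This is why a hit limit larger than the number of bills you expect to see is needed: seven hits here yield only two documents for display.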

    We have found that the default setting of 50 hits provides 8 to 12 documents for display, on average.
    For best performance, it is highly recommended that the default setting be used.


    SUBSCRIPTION SERVICES (E-mail)

    The Subscription service will notify you by e-mail every time action is taken on a bill. To subscribe to a bill, go to the Bill Information page. There, select the house (Assembly or Senate), enter the number of the bill you wish to subscribe to, and click Search. Your result will be displayed on the Bill Document page. From here, scroll to the bottom of the screen and click on the Subscribe button. Enter your e-mail address in the box provided and click OK. You can subscribe to more than one bill, but you must subscribe to each bill individually. If you receive the error message "Document contains no data", you have subscribed to the maximum number of bills allowed. To subscribe to additional bills, please use a different e-mail address.

    You can unsubscribe by following the same procedure to display the Bill Document screen and then selecting Unsubscribe at the bottom of the screen, or you can select Subscription List on the Bill Information page. Enter your e-mail address and the list of bills you have subscribed to will display. Check the bills you wish to unsubscribe from and click Update.

    You can view a list of bills you have subscribed to by selecting Subscription List on the Bill Information page. Enter your e-mail address and click OK.

    E-mail Defined

    In your E-mail message you will find the number of the bill you have subscribed to, with a link to the whole bill. Below that, you will find the section(s) of the bill that have been updated, each with a link to that section.

    The Bill includes the following sections:

    History
    Status
    Text
    Analysis
    Votes
    Vetoes

    We have provided the address(es) for you, so you can hyperlink directly to our site. If your E-mail program does not translate the address into a hyperlink, enter the address as your browser's URL.