Tokens in Python

Hey guys, this is the second tutorial of the Python for Beginners Series. If you haven’t read the first one i.e, Getting Started with Python yet then I suggest checking that out first. Also, this is going to be mostly theoretical with almost no coding involved but is very important.
Do not skip this if you don’t know about tokens in Python.

The Python Character Set
What are tokens?
Types of Tokens

The Python Character Set

It is the set of valid characters that python understands. Python uses the Unicode character set.
It includes numbers from numbers like
Digits: 0-9,
Letters: a-z, A-Z
Operators: +,-,/,*,//,**
Punctuators like : (colon), (), {}, []
Whitespaces like space, tabs, etc

What are tokens?

Tokens are building blocks of a language. They are the smallest individual unit of a program. There are five types of tokens in Python and we are going to discuss them one by one.

Types of Tokens

So the five types of tokens supported in Python are Keywords, Identifiers, Literals, Punctuators, and Operators. Coming over to the first one we have

Keywords

Keywords are the pre-defined set of words in a language that perform their specific function. You cannot assign a new value or task to them other than the pre-defined one.

You cannot use them as a variable, class, function, object or any other identifier.

For example: if, elif, while, True, False, None, break etc

These have their special task that you cannot change. For example break will only end the loop you cannot make it start the loop. (We’ll be covering loops in future lectures).

Identifiers

Now identifiers are the names that you can assign a value to. An identifier can be anything for example,

a = 10

Here, a is a valid identifier name. Any name you give your variable, function, or class is an identifier of that particular thing. Now there are certain rules that you have to follow to define a valid identifier name.

Rules for valid identifier name

A valid identifier name can have letters, digits, and underscore sign.
It can start with an alphabet or underscore but can never start with a digit.
It can never be a keyword name.
An identifier name can be of variable length.
The only special symbol that can be used in identifier name is underscore( _ ).

One more thing that you should remember that python is case sensitive i.e.,

a = 10
A = 5

These two hold two different values. a holds the value 10 and A holds the value 5.

Examples of valid identifier names include: a, _a, a12, etc
Examples of invalid identifier names include: 1a, $a, elif, print.
If you don’t understand why they are valid/invalid, read the rules again.

Literals

Literals are the fixed or constant values. They can either be string, numeric or boolean.

For example, anything within single or double quotes is classified as a string and is literal because it is a fixed value i.e, “Coding Ground” is literal because it is a string.

Another example is, 10. It is a simple number but is a fixed value. It is constant and it will remain constant. You can perform operations like addition or subtraction but the value of these two characters 1 and 0 put together in a correct order gives them a value equal to ten and that cannot be changed.

Boolean only consist of 2 values, True and False. Remember that “T” of True is capital. Python is case sensitive and if you write True with small “t” like true it will hold a different meaning. It will act as another variable.

It might seem a bit confusing for now but you’ll definitely understand it in our future lectures on boolean and other data types.

Punctuators or Separators

Punctuators, also known as separators give a structure to code. They are [mostly] used to define blocks in a program. We will be covering code blocks in control flow statements when we discuss how to apply conditions in Python.
Some examples of punctuators include
single quotes – ‘ ‘ , double quote – ” ” , parenthesis – ( ), brackets – [ ], Braces – { },
colon – ( : ) , comma ( , ), etc.
Punctuators and operators go hand in hand are used everywhere. For example,

name = "coding ground"

Here, an assignment operator ( = ) and punctuator, (” “) is used.

Operators

Operators are the symbols which are used to perform operations between operands.

Unary Operators: Operators having single operand.
Eg. +8, -7, etc
Binary Operators: Operators working on 2 operands.
Eg. 2+2, 4-3, 8*9, etc
Similarly, there are Ternary Operators that work on 3 operands and so on.
These are just basics and not so important to know but the operators listed below are very important

Arithmetic operators ( +, -, /, * etc)
Assignment operators ( = )
Comparison operators ( >, <, >=, <=, ==, !=)
Logical operators ( and, or, not)
Identity operators ( is, is not)
Membership operators ( in, not in)
Bitwise operators ( &, |, ^ etc)

There’s so much about operators that it cannot be written here as it will be out of the scope of this topic. A new detailed article on tokens will be there very soon.

Reference to Punctuators:
https://www.ibm.com/support/knowledgecenter/ssw_ibm_i_74/rzarg/punctuators.htm
https://download.mikroe.com/documents/compilers/mikroc/pic/help/punctuators.htm