Running Python Scripts

Session Overview

Welcome to our deep dive into running Python scripts! While the REPL is excellent for exploration and experimentation, most real Python development happens in script files. Today, we'll explore various ways to execute Python scripts, pass arguments to them, manage their execution environment, and incorporate them into larger systems. These skills form the foundation of practical Python programming.

Understanding Python Scripts

Python scripts are text files containing Python code that can be executed as a complete program. Unlike interactive REPL sessions, scripts allow you to save your code, run it repeatedly, automate tasks, and build larger applications.

What Makes a Python Script

File extension: Python scripts typically use the .py extension
Executable code: Contains Python code that runs from top to bottom
Reusability: Can be run repeatedly with the same or different inputs
Modularity: Can be imported into other scripts or the REPL

Creating Your First Script

Let's create a simple "Hello World" script:

Open a text editor (VS Code, Sublime Text, Notepad++, etc.)
Create a new file called hello_world.py
Add the following code:

# This is a comment in Python
print("Hello, World!")
print("Welcome to Python programming!")

# Variables and simple calculation
name = "Python Learner"
experience_years = 5
print(f"{name} has {experience_years} years of programming experience.")
print(f"In 2 more years, they will have {experience_years + 2} years of experience.")

This simple script demonstrates several key concepts:

Comments (lines starting with #)
Calls to print() for output
Variable assignment and usage
String formatting with f-strings
Basic arithmetic operations

Analogy: Scripts vs. Interactive Sessions

Think of the difference between Python scripts and REPL sessions like the difference between writing a letter and having a conversation:

REPL (Conversation): Immediate back-and-forth, good for exploration and quick questions, but ephemeral
Script (Letter): Carefully crafted, can be reviewed and edited before "sending," permanently recorded, can be referenced later

Just as you would choose a letter for important, reusable communication and a conversation for exploration, you choose between scripts and REPL based on your programming needs.

Basic Ways to Run Python Scripts

Method 1: Command Line Execution

The most common way to run a Python script is from the command line:

# On systems where the python command points to Python 3
python hello_world.py

# On systems where Python 3 is installed as python3 (common on Linux and macOS)
python3 hello_world.py

This invokes the Python interpreter and passes your script file as an argument. The interpreter reads the file, compiles it to bytecode (an intermediate representation), and then executes it.
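
The compile step can be observed from Python itself. The following is a minimal sketch that uses the built-in compile() function and the standard library's dis module on a one-line program. The file name passed to compile() is only a label (no file is read), and the exact instructions printed vary between Python versions.

```python
import dis

source = 'print("Hello, World!")'

# Compile the source text to a code object, much as the interpreter does for a script.
# "hello_world.py" is only a label used in tracebacks; no file is read here.
code_obj = compile(source, "hello_world.py", "exec")

# Show the bytecode instructions (the listing differs between Python versions).
dis.dis(code_obj)

# Execute the compiled code object.
exec(code_obj)
```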

Method 2: Integrated Development Environments (IDEs)

Most Python IDEs provide a "Run" button or keyboard shortcut to execute the current script:

VS Code: Press F5 or use the Run button
PyCharm: Right-click in the editor and select "Run" or press Shift+F10
IDLE: Press F5 or use the Run menu

IDEs often provide additional features such as:

Integrated terminal output
Debugging capabilities
Variable inspection
Performance profiling

Method 3: File Explorer (Windows)

On Windows, if Python is correctly associated with .py files, you can double-click a Python script in File Explorer to run it. However, this method has limitations:

The console window may close immediately after execution
Error messages disappear with the window, so failures are hard to diagnose
You cannot easily provide command-line arguments

To keep the window open after the script finishes on Windows, add this at the end:

input("Press Enter to exit...")

Advanced Script Execution Modes

Making Scripts Executable (Unix/Linux/macOS)

On Unix-based systems, you can make Python scripts directly executable:

Add a shebang line at the top of your script:

#!/usr/bin/env python3
print("This script is directly executable!")

The shebang line (#!/usr/bin/env python3) tells the system which interpreter to use for executing the script.

Make the script executable using chmod:

chmod +x my_script.py

Run the script directly:

./my_script.py

This approach is common in system automation and DevOps workflows.

Running as a Module

Python can run scripts as modules using the -m flag:

python -m my_module

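This differs from running a file directly in several ways:

Python adds the current directory to sys.path, rather than the directory containing the script
The module is located through the import system, so it must be importable (e.g., on sys.path or part of a valid package structure)
You give the module name, without the .py extension
As with direct execution, the module's __name__ is set to __main__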

This approach is commonly used for built-in modules with runnable functionality:

# Run the HTTP server module
python -m http.server 8000

# Run the unit test discovery module
python -m unittest discover
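
To see for yourself how the two invocation styles compare, a small diagnostic script helps. The sketch below assumes a hypothetical file named show_invocation.py; run it once as python show_invocation.py and once as python -m show_invocation from the same directory, then compare the output. The values noted in the comments describe typical behavior and can differ with interpreter options.

```python
import sys


def describe_invocation():
    """Return details worth comparing between 'python file.py' and 'python -m name'."""
    return {
        "name": __name__,      # "__main__" in both cases
        "argv0": sys.argv[0],  # script path as typed, or the module's full path with -m
        "path0": sys.path[0],  # usually the script's directory, or the current directory with -m
    }


if __name__ == "__main__":
    for key, value in describe_invocation().items():
        print(f"{key}: {value}")
```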

Interactive Mode with Scripts

You can run a script and then drop into an interactive session using the -i flag:

python -i my_script.py

This executes the script and then starts the REPL with all the script's variables and functions available for interactive use. This is extremely useful for debugging and exploring the state after script execution.

Command-Line Arguments

Command-line arguments allow users to provide input to scripts at runtime, making them more flexible and reusable.

Basic Argument Handling with sys.argv

The simplest way to handle command-line arguments is using the sys.argv list:

import sys

# sys.argv[0] is the script name
# sys.argv[1:] are the arguments passed to the script

if len(sys.argv) > 1:
    name = sys.argv[1]
    print(f"Hello, {name}!")
else:
    print("Hello, stranger! Please provide your name as an argument.")

Save this as greet.py and run it with:

python greet.py Alice

The output will be:

Hello, Alice!
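
One detail worth knowing: every element of sys.argv is a string, so numeric arguments must be converted before use. Here is a minimal sketch, assuming a hypothetical script called add_numbers.py; the main() function takes the argument list as a parameter so it can be tried without touching the real sys.argv.

```python
import sys


def total(arguments):
    """Convert each argument string to a float and return the sum."""
    return sum(float(value) for value in arguments)


def main(argv):
    # argv[0] is the script name; the remaining items are strings typed by the user.
    try:
        print(f"Total: {total(argv[1:])}")
    except ValueError as error:
        # float() raises ValueError for text such as "abc".
        print(f"Error: {error}", file=sys.stderr)
        return 1
    return 0


# Demonstration with an explicit list instead of the real sys.argv.
exit_code = main(["add_numbers.py", "2", "3.5"])
```

In a real script, the last line would typically be sys.exit(main(sys.argv)) placed under an if __name__ == "__main__": guard.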

Advanced Argument Parsing with argparse

For more complex argument handling, use the argparse module from the standard library:

import argparse

# Create an argument parser
parser = argparse.ArgumentParser(description='A greeting script with options.')

# Add arguments
parser.add_argument('name', help='Name of the person to greet')
parser.add_argument('--title', '-t', help='Title for the person')
parser.add_argument('--repeat', '-r', type=int, default=1, help='Number of times to repeat the greeting')

# Parse arguments
args = parser.parse_args()

# Use the arguments
greeting = "Hello"
if args.title:
    greeting += f", {args.title}"
greeting += f" {args.name}!"

for _ in range(args.repeat):
    print(greeting)

Save this as advanced_greet.py and run it with various arguments:

python advanced_greet.py Alice --title Dr. --repeat 3
python advanced_greet.py Bob -t Mr. -r 2
python advanced_greet.py --help

The argparse module provides many benefits:

Automatic help message generation
Type conversion and validation
Short and long argument formats
Required vs. optional arguments
Default values
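
Several of these benefits can be tried without a terminal, because parse_args() accepts an explicit list of strings in place of the real command line. The sketch below is a cut-down variant of the greeting parser; the --style option is a hypothetical addition used only to illustrate choices.

```python
import argparse


def build_parser():
    """Build a parser with one required and two optional arguments."""
    parser = argparse.ArgumentParser(description="A greeting script with options.")
    parser.add_argument("name", help="Name of the person to greet")
    parser.add_argument("--repeat", "-r", type=int, default=1,
                        help="Number of times to repeat the greeting")
    # choices restricts the accepted values; --style is a hypothetical option.
    parser.add_argument("--style", choices=["formal", "casual"], default="casual",
                        help="Greeting style")
    return parser


# Passing a list to parse_args() bypasses sys.argv, which is handy in tests.
args = build_parser().parse_args(["Alice", "--repeat", "2", "--style", "formal"])
print(args.name, args.repeat, args.style)

# Omitted options fall back to their defaults.
defaults = build_parser().parse_args(["Bob"])
print(defaults.repeat, defaults.style)
```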

Analogy: Command-Line Arguments as Function Parameters

Command-line arguments are like parameters to a function:

They allow you to pass data into your script
They can have default values
They can be required or optional
They can be validated or converted to specific types

Just as a well-designed function has clear parameters, a well-designed script has clear command-line arguments that make it flexible and reusable.

Script Execution Environment

Environment Variables

Scripts can access environment variables to configure their behavior:

import os

# Access environment variables
db_url = os.environ.get('DATABASE_URL', 'sqlite:///default.db')
debug_mode = os.environ.get('DEBUG', 'False').lower() == 'true'

print(f"Database URL: {db_url}")
print(f"Debug mode: {debug_mode}")

# For development, you can set environment variables before running
# export DATABASE_URL="postgresql://user:pass@localhost/mydb"
# export DEBUG="True"

This approach allows you to change script behavior without modifying code, which is especially useful for:

Different deployment environments (development, testing, production)
Sensitive information (API keys, passwords)
User-specific configuration
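
Environment variables are always strings, so scripts usually need a small conversion layer. The helpers below are one possible sketch, not a standard API: env_flag, env_int, and the MAX_RETRIES variable are names invented for this example, and the variables are set inside the code only so the demonstration is self-contained.

```python
import os

# Strings treated as "true"; this set is a choice made for this example.
TRUE_VALUES = {"1", "true", "yes", "on"}


def env_flag(name, default=False):
    """Read a boolean setting from the environment."""
    value = os.environ.get(name)
    if value is None:
        return default
    return value.strip().lower() in TRUE_VALUES


def env_int(name, default):
    """Read an integer setting, with a clear error for invalid text."""
    value = os.environ.get(name)
    if value is None:
        return default
    try:
        return int(value)
    except ValueError:
        raise ValueError(f"{name} must be an integer, got {value!r}") from None


# Set here for demonstration only; normally these come from the shell.
os.environ["DEBUG"] = "True"
os.environ["MAX_RETRIES"] = "5"

print(env_flag("DEBUG"))
print(env_int("MAX_RETRIES", 3))
```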

Working Directory and File Paths

Scripts often need to work with files in specific locations:

import os

# Get the current working directory
current_dir = os.getcwd()
print(f"Current directory: {current_dir}")

# Get the directory containing the script
script_dir = os.path.dirname(os.path.abspath(__file__))
print(f"Script directory: {script_dir}")

# Construct paths relative to the script
data_path = os.path.join(script_dir, 'data', 'input.csv')
print(f"Data file path: {data_path}")

# Check if a file exists
if os.path.exists(data_path):
    print(f"Data file exists: {data_path}")
else:
    print(f"Data file does not exist: {data_path}")

Using __file__ to find the script directory makes your code more robust, as it works regardless of the current working directory when the script is launched.
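
The same idea can be written with pathlib from the standard library, which many developers find easier to read than os.path. This is a minimal sketch: data_path_for is a hypothetical helper, and the path passed to it stands in for __file__, which is what you would pass in a real script.

```python
from pathlib import Path


def data_path_for(script_file):
    """Return the path to data/input.csv next to the given script file."""
    script_dir = Path(script_file).resolve().parent
    # The / operator joins path components.
    return script_dir / "data" / "input.csv"


# In a real script, pass __file__ here; this path is a placeholder.
example = data_path_for("/opt/tools/report.py")
print(example)
print(example.exists())
```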

Exit Codes

Scripts can communicate their execution status through exit codes:

import sys

def process_data(filename):
    try:
        with open(filename, 'r') as f:
            # Process the file...
            print(f"Successfully processed {filename}")
            return True
    except FileNotFoundError:
        print(f"Error: File not found: {filename}")
        return False
    except Exception as e:
        print(f"Error processing file: {e}")
        return False

if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Error: Please provide a filename")
        sys.exit(1)  # Exit with error code 1
    
    filename = sys.argv[1]
    success = process_data(filename)
    
    if success:
        sys.exit(0)  # Exit with success code 0
    else:
        sys.exit(2)  # Exit with error code 2

Exit codes are important for:

Scripts called from other programs or scripts
Batch processing and automation
Error handling in shell scripts

By convention, exit code 0 indicates success, while any non-zero value indicates an error.
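
To see how a calling program reads an exit code, here is a short sketch that launches a child Python process and inspects its return code. The child is a one-line program written for this example; sys.executable is the path of the interpreter running the code.

```python
import subprocess
import sys

# Run a child Python process that deliberately exits with status 2.
result = subprocess.run([sys.executable, "-c", "import sys; sys.exit(2)"])

# The caller decides what to do based on the exit code.
if result.returncode == 0:
    print("Child succeeded")
else:
    print(f"Child failed with exit code {result.returncode}")
```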

Script Modularity and Reusability

The __name__ == "__main__" Pattern

A common pattern in Python scripts is the if __name__ == "__main__": check:

# math_utils.py
def add(a, b):
    """Add two numbers and return the result."""
    return a + b

def multiply(a, b):
    """Multiply two numbers and return the result."""
    return a * b

# This block only runs when the script is executed directly
if __name__ == "__main__":
    print("Testing math utilities:")
    print(f"5 + 3 = {add(5, 3)}")
    print(f"4 * 6 = {multiply(4, 6)}")
    
    # You could also add command-line parsing here
    # import sys
    # a = int(sys.argv[1])
    # b = int(sys.argv[2])
    # print(f"{a} + {b} = {add(a, b)}")
    # print(f"{a} * {b} = {multiply(a, b)}")

This pattern provides dual functionality:

When run as a script (python math_utils.py), the test code executes
When imported as a module (import math_utils), the functions are defined but the test code does not run

This makes your code both executable and importable, which is a cornerstone of Python's reusability.
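
If you want to confirm what __name__ holds in each case, the following sketch writes a tiny temporary module and loads it both ways. The module name name_demo is invented for this example, and runpy and importlib are used here only to simulate "run as a script" and "import" inside one program.

```python
import importlib.util
import runpy
import tempfile
from pathlib import Path

# A one-line module that records the value of __name__ it sees.
source = "seen_name = __name__\n"

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "name_demo.py"
    path.write_text(source)

    # Simulates "python name_demo.py".
    as_script = runpy.run_path(str(path), run_name="__main__")

    # Simulates "import name_demo".
    spec = importlib.util.spec_from_file_location("name_demo", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

print(as_script["seen_name"])  # __main__
print(module.seen_name)        # name_demo
```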

Creating Executable Modules

You can structure a Python package to be both importable and executable:

# my_package/__main__.py
"""
This file makes the package directly executable with:
python -m my_package
"""
from .core import main

if __name__ == "__main__":
    main()

# my_package/core.py
def main():
    """Main function implementing the core functionality."""
    print("Running the main package functionality!")
    # ... actual code here ...

def helper_function():
    """A helper function used by main()."""
    return "Helper result"

This structure allows for:

Running as a module: python -m my_package
Importing specific functions: from my_package.core import helper_function
Clean separation between execution logic and core functionality

Organizing Larger Scripts

As scripts grow, organize them into functions with a clear entry point:

#!/usr/bin/env python3
"""
A data processing script that demonstrates good organization.
"""
import argparse
import logging
import os
import sys

def setup_logging(verbose=False):
    """Configure logging based on verbosity level."""
    level = logging.DEBUG if verbose else logging.INFO
    logging.basicConfig(level=level, format='%(levelname)s: %(message)s')

def parse_arguments():
    """Parse and return command-line arguments."""
    parser = argparse.ArgumentParser(description="Process data files.")
    parser.add_argument('input', help='Input file path')
    parser.add_argument('output', help='Output file path')
    parser.add_argument('-v', '--verbose', action='store_true', help='Enable verbose output')
    return parser.parse_args()

def read_data(input_path):
    """Read and parse the input data file."""
    logging.info(f"Reading data from {input_path}")
    try:
        with open(input_path, 'r') as f:
            return f.readlines()
    except Exception as e:
        logging.error(f"Failed to read input file: {e}")
        sys.exit(1)

def process_data(data):
    """Process the input data and return the results."""
    logging.info(f"Processing {len(data)} lines of data")
    # ... processing logic here ...
    return [line.upper() for line in data]  # Example: convert to uppercase

def write_results(output_path, results):
    """Write the processed results to the output file."""
    logging.info(f"Writing results to {output_path}")
    try:
        with open(output_path, 'w') as f:
            f.writelines(results)
    except Exception as e:
        logging.error(f"Failed to write output file: {e}")
        sys.exit(2)

def main():
    """Main entry point for the script."""
    args = parse_arguments()
    setup_logging(args.verbose)
    
    logging.debug("Starting data processing job")
    
    data = read_data(args.input)
    results = process_data(data)
    write_results(args.output, results)
    
    logging.info("Processing completed successfully")
    return 0

if __name__ == "__main__":
    sys.exit(main())

Benefits of this organization:

Each function has a single responsibility
Clear entry point through main()
Proper error handling and logging
Testable components
Exit code management
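
To illustrate the "testable components" point, here is a sketch of a unit test for the process_data function. The function is repeated so the example is self-contained, and the test class name and test cases are choices made for this example.

```python
import unittest


def process_data(data):
    """Convert each line to uppercase (same logic as the script above)."""
    return [line.upper() for line in data]


class ProcessDataTests(unittest.TestCase):
    def test_uppercases_each_line(self):
        self.assertEqual(process_data(["a\n", "b\n"]), ["A\n", "B\n"])

    def test_empty_input_gives_empty_output(self):
        self.assertEqual(process_data([]), [])


# Run the tests programmatically so the example works outside a test runner.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ProcessDataTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because process_data takes a list and returns a list, it can be tested without creating any files or parsing any arguments.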

Analogy: Well-Structured Scripts as Recipes

A well-structured script is like a professional recipe:

Ingredients (arguments, inputs) are clearly listed at the beginning
Each step (function) has a specific purpose and clear instructions
Steps are performed in a logical order
The recipe can be scaled or adapted for different situations
Experienced chefs (developers) can reuse components in other recipes

Just as a good recipe is easy to follow and adapt, a well-structured script is easy to understand and maintain.

Real-World Script Examples

Data Processing Script

This script processes CSV data, a common task in data analysis:

#!/usr/bin/env python3
"""
Process sales data to generate a summary report.
"""
import csv
import argparse
from collections import defaultdict
from datetime import datetime

def parse_args():
    parser = argparse.ArgumentParser(description='Generate sales report from CSV data')
    parser.add_argument('input_file', help='Input CSV file path')
    parser.add_argument('output_file', help='Output report file path')
    parser.add_argument('--year', type=int, help='Filter by year')
    return parser.parse_args()

def process_sales_data(input_file, year_filter=None):
    sales_by_region = defaultdict(float)
    sales_by_product = defaultdict(float)
    total_sales = 0.0
    
    with open(input_file, 'r', newline='') as csvfile:
        reader = csv.DictReader(csvfile)
        for row in reader:
            # Parse the date
            date = datetime.strptime(row['date'], '%Y-%m-%d')
            
            # Apply year filter if specified
            if year_filter and date.year != year_filter:
                continue
                
            # Extract data
            region = row['region']
            product = row['product']
            amount = float(row['amount'])
            
            # Update our aggregations
            sales_by_region[region] += amount
            sales_by_product[product] += amount
            total_sales += amount
    
    return {
        'total_sales': total_sales,
        'sales_by_region': sales_by_region,
        'sales_by_product': sales_by_product,
    }

def write_report(output_file, data):
    with open(output_file, 'w') as f:
        f.write("SALES REPORT\n")
        f.write("=" * 40 + "\n\n")
        
        f.write(f"Total Sales: ${data['total_sales']:.2f}\n\n")
        
        f.write("Sales by Region:\n")
        for region, amount in sorted(data['sales_by_region'].items()):
            f.write(f"  {region}: ${amount:.2f}\n")
        f.write("\n")
        
        f.write("Sales by Product:\n")
        for product, amount in sorted(data['sales_by_product'].items()):
            f.write(f"  {product}: ${amount:.2f}\n")

def main():
    args = parse_args()
    data = process_sales_data(args.input_file, args.year)
    write_report(args.output_file, data)
    print(f"Report written to {args.output_file}")

if __name__ == "__main__":
    main()
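
The script expects a CSV file with date, region, product, and amount columns. The sketch below shows that layout with a few made-up rows and repeats the region aggregation in memory; the sample values are hypothetical and exist only to illustrate the format.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows using the columns the script reads.
SAMPLE_CSV = """date,region,product,amount
2023-01-15,North,Widget,100.00
2023-02-03,South,Gadget,250.50
2024-03-10,North,Gadget,75.25
"""

sales_by_region = defaultdict(float)

# io.StringIO lets csv.DictReader read from a string as if it were a file.
reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
for row in reader:
    sales_by_region[row["region"]] += float(row["amount"])

for region, amount in sorted(sales_by_region.items()):
    print(f"{region}: ${amount:.2f}")
```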

Automation Script

This script automates a common development workflow:

#!/usr/bin/env python3
"""
Automate the process of updating code, running tests, and deploying if tests pass.
"""
import os
import subprocess
import argparse
import logging
import sys

def setup_logging(verbose=False):
    level = logging.DEBUG if verbose else logging.INFO
    logging.basicConfig(
        level=level,
        format='%(asctime)s - %(levelname)s - %(message)s',
        datefmt='%Y-%m-%d %H:%M:%S'
    )

def parse_args():
    parser = argparse.ArgumentParser(description='Automate code update and deployment')
    parser.add_argument('repo_dir', help='Repository directory')
    parser.add_argument('--branch', default='main', help='Branch to update')
    parser.add_argument('--deploy', action='store_true', help='Deploy if tests pass')
    parser.add_argument('-v', '--verbose', action='store_true', help='Verbose output')
    return parser.parse_args()

def run_command(command, cwd=None):
    """Run a shell command and return (success, output)."""
    logging.debug(f"Running command: {command}")
    try:
        # shell=True passes the string to the shell; only use it with trusted input
        result = subprocess.run(
            command,
            shell=True,
            cwd=cwd,
            check=True,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            text=True
        )
        return True, result.stdout
    except subprocess.CalledProcessError as e:
        # Some tools (pytest, for example) report failures on stdout, not stderr
        return False, e.stderr or e.stdout

def update_code(repo_dir, branch):
    """Pull the latest code from the repository."""
    logging.info(f"Updating code in {repo_dir} (branch: {branch})")
    
    # Ensure we're on the right branch
    success, output = run_command(f"git checkout {branch}", cwd=repo_dir)
    if not success:
        logging.error(f"Failed to checkout branch {branch}: {output}")
        return False
    
    # Pull the latest changes
    success, output = run_command("git pull", cwd=repo_dir)
    if not success:
        logging.error(f"Failed to pull latest changes: {output}")
        return False
    
    logging.info("Code updated successfully")
    return True

def run_tests(repo_dir):
    """Run the test suite."""
    logging.info("Running tests...")
    success, output = run_command("python -m pytest", cwd=repo_dir)
    if not success:
        logging.error(f"Tests failed: {output}")
        return False
    
    logging.info("All tests passed!")
    return True

def deploy(repo_dir):
    """Deploy the application."""
    logging.info("Deploying application...")
    success, output = run_command("./deploy.sh", cwd=repo_dir)
    if not success:
        logging.error(f"Deployment failed: {output}")
        return False
    
    logging.info("Deployment successful!")
    return True

def main():
    args = parse_args()
    setup_logging(args.verbose)
    
    # Ensure the repository directory exists
    if not os.path.isdir(args.repo_dir):
        logging.error(f"Directory not found: {args.repo_dir}")
        return 1
    
    # Update the code
    if not update_code(args.repo_dir, args.branch):
        return 2
    
    # Run tests
    if not run_tests(args.repo_dir):
        return 3
    
    # Deploy if requested and tests passed
    if args.deploy:
        if not deploy(args.repo_dir):
            return 4
    else:
        logging.info("Skipping deployment (use --deploy to deploy)")
    
    logging.info("All tasks completed successfully")
    return 0

if __name__ == "__main__":
    sys.exit(main())
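
Because the script builds shell command strings from user input such as the branch name, a variant that avoids the shell is worth considering. The sketch below is one possible alternative, not part of the script above: run_command_args is a hypothetical helper that takes the command as a list, so each argument reaches the program without shell quoting. It runs a harmless Python one-liner in place of git.

```python
import subprocess
import sys


def run_command_args(args, cwd=None):
    """Run a command given as a list of arguments, without a shell."""
    result = subprocess.run(args, cwd=cwd, capture_output=True, text=True)
    success = result.returncode == 0
    # On failure, prefer stderr but fall back to stdout if stderr is empty.
    output = result.stdout if success else (result.stderr or result.stdout)
    return success, output


# Stand-in for a real command; in the workflow above this might be
# ["git", "checkout", branch].
ok, output = run_command_args([sys.executable, "-c", "print('tests passed')"])
print(ok, output.strip())
```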

Web API Script

This script interacts with a web API and processes the results:

#!/usr/bin/env python3
"""
Fetch weather data from an API and display a forecast.
"""
import argparse
import json
import os
import sys
from datetime import datetime

import requests  # third-party: pip install requests

def parse_args():
    parser = argparse.ArgumentParser(description='Display weather forecast for a location')
    parser.add_argument('location', help='City name or postal code')
    parser.add_argument('--api-key', help='API key (or set WEATHER_API_KEY env var)')
    parser.add_argument('--days', type=int, default=3, help='Number of days to forecast')
    parser.add_argument('--output', choices=['text', 'json'], default='text',
                        help='Output format')
    return parser.parse_args()

def get_api_key(args):
    """Get API key from args or environment variable."""
    if args.api_key:
        return args.api_key
    
    api_key = os.environ.get('WEATHER_API_KEY')
    if not api_key:
        sys.stderr.write("Error: API key not provided. Use --api-key or set WEATHER_API_KEY environment variable.\n")
        sys.exit(1)
    
    return api_key

def fetch_weather(location, api_key, days=3):
    """Fetch weather data from the API."""
    url = "https://api.example.com/weather"
    params = {
        'location': location,
        'days': days,
        'key': api_key
    }
    
    try:
        response = requests.get(url, params=params, timeout=10)  # without a timeout, requests can wait indefinitely
        response.raise_for_status()  # Raise exception for 4XX/5XX responses
        return response.json()
    except requests.exceptions.RequestException as e:
        sys.stderr.write(f"Error fetching weather data: {e}\n")
        sys.exit(2)

def format_text_output(data):
    """Format weather data as human-readable text."""
    location = data['location']['name']
    country = data['location']['country']
    current = data['current']
    forecast = data['forecast']['forecastday']
    
    output = [
        f"Weather for {location}, {country}",
        f"Current: {current['temp_c']}°C, {current['condition']['text']}",
        "\nForecast:",
    ]
    
    for day in forecast:
        date = datetime.strptime(day['date'], '%Y-%m-%d').strftime('%a, %b %d')
        output.append(f"  {date}: {day['day']['avgtemp_c']}°C, {day['day']['condition']['text']}")
    
    return '\n'.join(output)

def main():
    args = parse_args()
    api_key = get_api_key(args)
    
    # Fetch the weather data
    data = fetch_weather(args.location, api_key, args.days)
    
    # Output the data in the requested format
    if args.output == 'json':
        print(json.dumps(data, indent=2))
    else:
        print(format_text_output(data))
    
    return 0

if __name__ == "__main__":
    sys.exit(main())

Best Practices for Python Scripts

Script Structure

Shebang line: Start with #!/usr/bin/env python3 for Unix compatibility
Docstring: Include a module-level docstring that explains the script's purpose
Imports: Place imports at the top of the file, grouped by standard library, third-party, and local modules
Functions: Break the script into logical functions with their own docstrings
Main entry point: Use the if __name__ == "__main__" pattern
Exit codes: Return meaningful exit codes from the main function

Error Handling

Input validation: Validate all inputs, especially user-provided data
Specific exceptions: Catch specific exceptions rather than using bare except clauses
Meaningful errors: Provide clear error messages that help the user understand what went wrong
Resource cleanup: Use with statements to ensure resources are released even if errors occur
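
As an illustration of input validation with specific exceptions, argparse accepts any callable as the type of an argument. The sketch below assumes a hypothetical positive_int validator; when it raises argparse.ArgumentTypeError, argparse prints the message along with the usage text and exits.

```python
import argparse


def positive_int(text):
    """Convert text to an int and reject values that are zero or negative."""
    try:
        value = int(text)
    except ValueError:
        raise argparse.ArgumentTypeError(f"{text!r} is not a whole number") from None
    if value <= 0:
        raise argparse.ArgumentTypeError(f"{text!r} must be greater than zero")
    return value


parser = argparse.ArgumentParser(description="Input validation demo")
parser.add_argument("--days", type=positive_int, default=3, help="Number of days")

# An explicit list is used here instead of the real command line.
args = parser.parse_args(["--days", "5"])
print(args.days)
```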

Documentation

Script header: Include author, date, purpose, and usage information
Function docstrings: Document what each function does, its parameters, and return value
Complex logic: Explain any complex or non-obvious code with comments
Usage examples: Include examples in the docstring or a separate section

Testing and Debugging

Testable code: Write code that can be unit tested
Verbose mode: Include an option for verbose output to help with debugging
Logging: Use the logging module instead of print for more control over output
Dry run mode: For scripts that modify data, consider adding a "dry run" option that shows what would happen without making changes
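
A dry run option can be as simple as a boolean parameter that guards the destructive step. The following is a minimal sketch, with a hypothetical remove_files function and a temporary file created only for the demonstration; in a real script the flag would usually come from an argparse option such as --dry-run.

```python
import logging
import tempfile
from pathlib import Path

logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")


def remove_files(paths, dry_run=True):
    """Delete the given files, or only report what would be deleted."""
    removed = []
    for path in paths:
        if dry_run:
            logging.info("Would remove %s", path)
        else:
            path.unlink()
            logging.info("Removed %s", path)
            removed.append(path)
    return removed


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "old_report.txt"
    target.write_text("stale data\n")

    # The dry run only logs; the file is left in place.
    remove_files([target], dry_run=True)
    still_there = target.exists()

    # The real run deletes the file.
    remove_files([target], dry_run=False)
    gone = not target.exists()
```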

Performance and Scalability

Memory efficiency: Process large data incrementally rather than loading it all into memory
Progress feedback: For long-running tasks, provide progress updates
Resource limitations: Be aware of system limitations (file descriptors, memory, etc.)
Parallel processing: Use threads or asyncio for I/O-bound tasks and multiprocessing for CPU-bound tasks, since threads in CPython generally do not run Python code in parallel
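
To make the memory-efficiency point concrete, the sketch below counts words by iterating over a file one line at a time instead of calling read() or readlines() on the whole file. The count_words function and the sample file are created for this example.

```python
import tempfile
from pathlib import Path


def count_words(path):
    """Count words while holding only one line in memory at a time."""
    total = 0
    with open(path, "r", encoding="utf-8") as handle:
        # Iterating over the file object yields lines lazily.
        for line in handle:
            total += len(line.split())
    return total


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("one two three\nfour five\n", encoding="utf-8")
    word_total = count_words(sample)

print(word_total)  # 5
```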

Practical Exercises

Exercise 1: Basic Script Creation

Create a script that asks the user for their name and age
Calculate how many days they have been alive (approximately)
Tell them how old they will be in 2030
Make the script executable (on Unix systems) or runnable from the command line

Exercise 2: Command-Line Tool

Create a command-line tool that counts words, lines, and characters in a text file
Use argparse to handle command-line options
Add options to count only words, only lines, or only characters
Add an option to exclude common words (e.g., "the", "and", "a")
Make the script work with multiple input files

Exercise 3: Data Processing Script

Create a script that reads a CSV file containing data (e.g., sales records, student grades)
Process the data (calculate totals, averages, etc.)
Generate a report in either text or HTML format
Add command-line options to filter the data
Handle errors gracefully (file not found, invalid data, etc.)

Wrapping Up and Next Steps

Today we've covered the essentials of running Python scripts, from basic execution to advanced techniques and best practices. You now have the knowledge to create robust, reusable, and maintainable Python scripts for a wide variety of tasks.

Key Takeaways

Python scripts are saved text files with a .py extension that contain executable Python code
Scripts can be run from the command line, IDEs, or made directly executable
Command-line arguments make scripts flexible and reusable
The if __name__ == "__main__" pattern allows code to be both importable and executable
Well-structured scripts are modular, testable, and handle errors gracefully
Real-world scripts often involve data processing, automation, or API interactions

Where to Go from Here

Practice creating scripts for your own common tasks and workflows
Explore additional standard library modules that help with script creation (e.g., pathlib, csv, json)
Learn about packaging and distribution to share your scripts with others
Dive into testing frameworks to ensure your scripts work correctly
Explore automation tools that can run your scripts on schedules or in response to events

In our next session, we'll build on these concepts as we explore Python's data structures and how to effectively work with them in your scripts and applications.