Session Overview
Welcome to our deep dive into running Python scripts! While the REPL is excellent for exploration and experimentation, most real Python development happens in script files. Today, we'll explore various ways to execute Python scripts, pass arguments to them, manage their execution environment, and incorporate them into larger systems. These skills form the foundation of practical Python programming.
Understanding Python Scripts
Python scripts are text files containing Python code that can be executed as a complete program. Unlike interactive REPL sessions, scripts allow you to save your code, run it repeatedly, automate tasks, and build larger applications.
What Makes a Python Script
- File extension: Python scripts typically use the .py extension
- Executable code: Contains Python code that runs from top to bottom
- Reusability: Can be run repeatedly with the same or different inputs
- Modularity: Can be imported into other scripts or the REPL
Creating Your First Script
Let's create a simple "Hello World" script:
- Open a text editor (VS Code, Sublime Text, Notepad++, etc.)
- Create a new file called hello_world.py
- Add the following code:
# This is a comment in Python
print("Hello, World!")
print("Welcome to Python programming!")
# Variables and simple calculation
name = "Python Learner"
experience_years = 5
print(f"{name} has {experience_years} years of programming experience.")
print(f"In 2 more years, they will have {experience_years + 2} years of experience.")
This simple script demonstrates several key concepts:
- Comments (lines starting with #)
- Print statements for output
- Variable declarations and usage
- String formatting with f-strings
- Basic arithmetic operations
Analogy: Scripts vs. Interactive Sessions
Think of the difference between Python scripts and REPL sessions like the difference between writing a letter and having a conversation:
- REPL (Conversation): Immediate back-and-forth, good for exploration and quick questions, but ephemeral
- Script (Letter): Carefully crafted, can be reviewed and edited before "sending," permanently recorded, can be referenced later
Just as you would choose a letter for important, reusable communication and a conversation for exploration, you choose between scripts and REPL based on your programming needs.
Basic Ways to Run Python Scripts
Method 1: Command Line Execution
The most common way to run a Python script is from the command line:
# On systems where the python command runs Python 3
python hello_world.py
# On systems where Python 3 is installed as python3 (for example, alongside Python 2)
python3 hello_world.py
This invokes the Python interpreter and passes your script file as an argument. The interpreter reads the file, compiles it to bytecode (an intermediate representation), and then executes it.
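The compile-then-execute sequence can be reproduced by hand with the built-in compile() and exec() functions. The following is a minimal sketch for illustration only: the source string stands in for the contents of hello_world.py, and the interpreter does this work for you when you run a script.

```python
import contextlib
import dis
import io

# Stand-in for the text the interpreter would read from hello_world.py
source = 'print("Hello, World!")\n'

# Step 1: compile the source text into a code object (bytecode)
code_object = compile(source, "hello_world.py", "exec")

# Optional: list the bytecode instruction names (they vary between Python versions)
instruction_names = [ins.opname for ins in dis.get_instructions(code_object)]

# Step 2: execute the compiled code, capturing what it prints
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    exec(code_object)
output = buffer.getvalue()

print(output, end="")
```

Calling dis.dis(code_object) prints a readable listing of the same instructions if you want to look at the bytecode itself.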
Method 2: Integrated Development Environments (IDEs)
Most Python IDEs provide a "Run" button or keyboard shortcut to execute the current script:
- VS Code: Press F5 or use the Run button
- PyCharm: Right-click in the editor and select "Run" or press Shift+F10
- IDLE: Press F5 or use the Run menu
IDEs often provide additional features such as:
- Integrated terminal output
- Debugging capabilities
- Variable inspection
- Performance profiling
Method 3: File Explorer (Windows)
On Windows, if Python is correctly associated with .py files, you can double-click a Python script in File Explorer to run it. However, this method has limitations:
- The console window may close immediately after execution
- You cannot easily provide command-line arguments
- This method is not suitable for scripts that require user input
For scripts that need to stay open after execution on Windows, add this at the end:
input("Press Enter to exit...")
Advanced Script Execution Modes
Making Scripts Executable (Unix/Linux/macOS)
On Unix-based systems, you can make Python scripts directly executable:
- Add a shebang line at the top of your script:
#!/usr/bin/env python3
print("This script is directly executable!")
The shebang line (#!/usr/bin/env python3) tells the system which interpreter to use for executing the script.
- Make the script executable using chmod:
chmod +x my_script.py
- Run the script directly:
./my_script.py
This approach is common in system automation and DevOps workflows.
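If you need to set the executable bit from Python itself (for example, in a setup script), the standard library can do roughly what chmod +x does. This is a sketch under the assumption of a Unix-like system; the helper name make_executable and the temporary file are made up for illustration.

```python
import os
import stat
import tempfile
from pathlib import Path


def make_executable(path):
    """Add execute permission for user, group, and others."""
    current_mode = os.stat(path).st_mode
    os.chmod(path, current_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "my_script.py"
    script.write_text('#!/usr/bin/env python3\nprint("This script is directly executable!")\n')

    executable_before = bool(os.stat(script).st_mode & stat.S_IXUSR)
    make_executable(script)
    executable_after = bool(os.stat(script).st_mode & stat.S_IXUSR)

    print(f"Executable before: {executable_before}, after: {executable_after}")
```

Unlike chmod +x in a shell, this helper does not consult the umask; it sets all three execute bits unconditionally.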
Running as a Module
Python can run scripts as modules using the -m flag:
python -m my_module
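This is different from direct execution in several ways:
- Python adds the current working directory to sys.path, whereas direct execution adds the directory that contains the script
- You pass the module name, so you don't include the .py extension
- The module must be importable (e.g., located on sys.path or inside a valid Python package structure)
In both cases, the code runs with __name__ set to "__main__".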
This approach is commonly used for built-in modules with runnable functionality:
# Run the HTTP server module
python -m http.server 8000
# Run the unit test discovery module
python -m unittest discover
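One way to see how the two launch styles treat sys.path is to run the same file both ways and compare. The sketch below is an experiment, not part of any real project: the module name probe and the directory names are invented, and it uses sys.executable to call the same interpreter that runs the sketch.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# A throwaway module that reports how it was started
PROBE = "import sys\nprint(__name__)\nprint(sys.path[0])\n"


def run_python(args, cwd):
    """Run the current interpreter with the given arguments and return its output lines."""
    result = subprocess.run(
        [sys.executable, *args],
        cwd=cwd,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "project"
    elsewhere = Path(tmp) / "elsewhere"
    project.mkdir()
    elsewhere.mkdir()
    (project / "probe.py").write_text(PROBE)

    # Direct execution from another directory: the script's folder leads sys.path
    direct = run_python([str(project / "probe.py")], cwd=elsewhere)

    # Module execution: works only because the working directory contains probe.py
    as_module = run_python(["-m", "probe"], cwd=project)

    print("direct:   ", direct)
    print("as module:", as_module)
```

Both runs report __main__ as the module name; the interesting part is the second line of each result, which shows where Python looked first for imports.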
Interactive Mode with Scripts
You can run a script and then drop into an interactive session using the -i flag:
python -i my_script.py
This executes the script and then starts the REPL with all the script's variables and functions available for interactive use. This is extremely useful for debugging and exploring the state after script execution.
Command-Line Arguments
Command-line arguments allow users to provide input to scripts at runtime, making them more flexible and reusable.
Basic Argument Handling with sys.argv
The simplest way to handle command-line arguments is using the sys.argv list:
import sys
# sys.argv[0] is the script name
# sys.argv[1:] are the arguments passed to the script
if len(sys.argv) > 1:
    name = sys.argv[1]
    print(f"Hello, {name}!")
else:
    print("Hello, stranger! Please provide your name as an argument.")
Save this as greet.py and run it with:
python greet.py Alice
The output will be:
Hello, Alice!
Advanced Argument Parsing with argparse
For more complex argument handling, use the argparse module from the standard library:
import argparse
# Create an argument parser
parser = argparse.ArgumentParser(description='A greeting script with options.')
# Add arguments
parser.add_argument('name', help='Name of the person to greet')
parser.add_argument('--title', '-t', help='Title for the person')
parser.add_argument('--repeat', '-r', type=int, default=1, help='Number of times to repeat the greeting')
# Parse arguments
args = parser.parse_args()
# Use the arguments
greeting = "Hello"
if args.title:
    greeting += f", {args.title}"
greeting += f" {args.name}!"
for _ in range(args.repeat):
    print(greeting)
Save this as advanced_greet.py and run it with various arguments:
python advanced_greet.py Alice --title Dr. --repeat 3
python advanced_greet.py Bob -t Mr. -r 2
python advanced_greet.py --help
The argparse module provides many benefits:
- Automatic help message generation
- Type conversion and validation
- Short and long argument formats
- Required vs. optional arguments
- Default values
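To try these features without touching the command line, you can hand parse_args() an explicit list of strings. The sketch below reuses the name and --repeat arguments from advanced_greet.py; the --style option is a hypothetical addition that exists only to show the choices parameter.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="A greeting script with options.")
    parser.add_argument("name", help="Name of the person to greet")
    parser.add_argument("--repeat", "-r", type=int, default=1,
                        help="Number of times to repeat the greeting")
    # Hypothetical option, added here only to demonstrate validation with choices
    parser.add_argument("--style", choices=["formal", "casual"], default="casual",
                        help="Greeting style")
    return parser


parser = build_parser()

# Type conversion: the string "3" arrives as the integer 3
args = parser.parse_args(["Alice", "--repeat", "3"])
print(type(args.repeat).__name__, args.repeat, args.style)

# Validation: a bad value makes argparse print a usage message and exit
try:
    parser.parse_args(["Alice", "--repeat", "three"])
except SystemExit as exc:
    exit_code = exc.code
    print(f"argparse exited with status {exit_code}")
```

Passing a list this way is also convenient in unit tests, because the test does not have to modify sys.argv.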
Analogy: Command-Line Arguments as Function Parameters
Command-line arguments are like parameters to a function:
- They allow you to pass data into your script
- They can have default values
- They can be required or optional
- They can be validated or converted to specific types
Just as a well-designed function has clear parameters, a well-designed script has clear command-line arguments that make it flexible and reusable.
Script Execution Environment
Environment Variables
Scripts can access environment variables to configure their behavior:
import os
# Access environment variables
db_url = os.environ.get('DATABASE_URL', 'sqlite:///default.db')
debug_mode = os.environ.get('DEBUG', 'False').lower() == 'true'
print(f"Database URL: {db_url}")
print(f"Debug mode: {debug_mode}")
# For development, you can set environment variables before running
# export DATABASE_URL="postgresql://user:pass@localhost/mydb"
# export DEBUG="True"
This approach allows you to change script behavior without modifying code, which is especially useful for:
- Different deployment environments (development, testing, production)
- Sensitive information (API keys, passwords)
- User-specific configuration
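Environment variables are always strings, so scripts usually need a small conversion layer. The helpers below are one possible sketch: the names env_flag and env_int, the set of accepted "true" spellings, and the PORT variable are assumptions made for this example, not a standard API.

```python
import os

# Spellings treated as "true"; this set is a convention chosen for the example
TRUE_VALUES = {"1", "true", "yes", "on"}


def env_flag(name, default=False, environ=os.environ):
    """Read a boolean setting such as DEBUG=true."""
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUE_VALUES


def env_int(name, default, environ=os.environ):
    """Read an integer setting, exiting with a clear message if it is malformed."""
    raw = environ.get(name)
    if raw is None:
        return default
    try:
        return int(raw)
    except ValueError:
        raise SystemExit(f"Error: {name} must be an integer, got {raw!r}")


# A plain dict stands in for os.environ so the example is repeatable
fake_env = {"DEBUG": "True", "PORT": "8080"}
debug_mode = env_flag("DEBUG", environ=fake_env)
port = env_int("PORT", 8000, environ=fake_env)
timeout = env_int("TIMEOUT", 30, environ=fake_env)

print(debug_mode, port, timeout)
```

Accepting the environment as a parameter keeps the helpers easy to test; in a real script you would simply omit the environ argument.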
Working Directory and File Paths
Scripts often need to work with files in specific locations:
import os
# Get the current working directory
current_dir = os.getcwd()
print(f"Current directory: {current_dir}")
# Get the directory containing the script
script_dir = os.path.dirname(os.path.abspath(__file__))
print(f"Script directory: {script_dir}")
# Construct paths relative to the script
data_path = os.path.join(script_dir, 'data', 'input.csv')
print(f"Data file path: {data_path}")
# Check if a file exists
if os.path.exists(data_path):
    print(f"Data file exists: {data_path}")
else:
    print(f"Data file does not exist: {data_path}")
Using __file__ to find the script directory makes your code more robust, as it works regardless of the current working directory when the script is launched.
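The same idea can be written with pathlib, which many people find easier to read than nested os.path calls. In this sketch the helper name path_next_to and the path /opt/tools/report.py are invented for illustration; in a real script you would pass __file__ as the first argument.

```python
from pathlib import Path


def path_next_to(script_file, *parts):
    """Build a path relative to the directory that contains script_file."""
    return Path(script_file).resolve().parent.joinpath(*parts)


# In a script: data_path = path_next_to(__file__, "data", "input.csv")
data_path = path_next_to("/opt/tools/report.py", "data", "input.csv")

print(f"Data file path: {data_path}")
print(f"Data file exists: {data_path.exists()}")
```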
Exit Codes
Scripts can communicate their execution status through exit codes:
import sys

def process_data(filename):
    try:
        with open(filename, 'r') as f:
            # Process the file...
            print(f"Successfully processed {filename}")
        return True
    except FileNotFoundError:
        print(f"Error: File not found: {filename}")
        return False
    except Exception as e:
        print(f"Error processing file: {e}")
        return False

if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Error: Please provide a filename")
        sys.exit(1)  # Exit with error code 1
    filename = sys.argv[1]
    success = process_data(filename)
    if success:
        sys.exit(0)  # Exit with success code 0
    else:
        sys.exit(2)  # Exit with error code 2
Exit codes are important for:
- Scripts called from other programs or scripts
- Batch processing and automation
- Error handling in shell scripts
By convention, exit code 0 indicates success, while any non-zero value indicates an error.
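From the caller's side, the exit code is what lets another program decide what to do next. The sketch below launches a child Python process and inspects its return code; the one-line child program is a stand-in for a real script, and the helper name run_and_report is made up for this example.

```python
import subprocess
import sys


def run_and_report(command):
    """Run a command and describe the outcome based on its exit code."""
    completed = subprocess.run(command)
    if completed.returncode == 0:
        status = "succeeded"
    else:
        status = f"failed with exit code {completed.returncode}"
    return completed.returncode, status


# Stand-in for a script that reports a processing error via sys.exit(2)
child = [sys.executable, "-c", "import sys; sys.exit(2)"]
code, message = run_and_report(child)

print(f"Child process {message}")
```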
Script Modularity and Reusability
The __name__ == "__main__" Pattern
A common pattern in Python scripts is the if __name__ == "__main__": check:
# math_utils.py
def add(a, b):
    """Add two numbers and return the result."""
    return a + b

def multiply(a, b):
    """Multiply two numbers and return the result."""
    return a * b

# This block only runs when the script is executed directly
if __name__ == "__main__":
    print("Testing math utilities:")
    print(f"5 + 3 = {add(5, 3)}")
    print(f"4 * 6 = {multiply(4, 6)}")
    # You could also add command-line parsing here
    # import sys
    # a = int(sys.argv[1])
    # b = int(sys.argv[2])
    # print(f"{a} + {b} = {add(a, b)}")
    # print(f"{a} * {b} = {multiply(a, b)}")
This pattern provides dual functionality:
- When run as a script (python math_utils.py), the test code executes
- When imported as a module (import math_utils), the functions are defined but the test code doesn't run
This makes your code both executable and importable, which is a cornerstone of Python's reusability.
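You can watch the guard take effect by loading the same file under two different names. The sketch below uses runpy.run_path from the standard library to do that; the file name math_utils_demo.py and its contents are a cut-down stand-in written to a temporary directory, not the full math_utils.py above.

```python
import contextlib
import io
import runpy
import tempfile
from pathlib import Path

SOURCE = '''
def add(a, b):
    return a + b

if __name__ == "__main__":
    print("running as a script")
'''


def run_with_name(path, run_name):
    """Execute a file with the given __name__ and return its globals and printed output."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        namespace = runpy.run_path(str(path), run_name=run_name)
    return namespace, buffer.getvalue()


with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "math_utils_demo.py"
    script.write_text(SOURCE)

    # Like "python math_utils_demo.py": the guarded block runs
    script_ns, script_output = run_with_name(script, "__main__")

    # Like "import math_utils_demo": functions are defined, the guarded block is skipped
    module_ns, module_output = run_with_name(script, "math_utils_demo")

print(repr(script_output), repr(module_output), module_ns["add"](5, 3))
```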
Creating Executable Modules
You can structure a Python package to be both importable and executable:
# my_package/__main__.py
"""
This file makes the package directly executable with:
python -m my_package
"""
from .core import main

if __name__ == "__main__":
    main()

# my_package/core.py
def main():
    """Main function implementing the core functionality."""
    print("Running the main package functionality!")
    # ... actual code here ...

def helper_function():
    """A helper function used by main()."""
    return "Helper result"
This structure allows for:
- Running as a module: python -m my_package
- Importing specific functions: from my_package.core import helper_function
- Clean separation between execution logic and core functionality
Organizing Larger Scripts
As scripts grow, organize them into functions with a clear entry point:
#!/usr/bin/env python3
"""
A data processing script that demonstrates good organization.
"""
import argparse
import logging
import os
import sys

def setup_logging(verbose=False):
    """Configure logging based on verbosity level."""
    level = logging.DEBUG if verbose else logging.INFO
    logging.basicConfig(level=level, format='%(levelname)s: %(message)s')

def parse_arguments():
    """Parse and return command-line arguments."""
    parser = argparse.ArgumentParser(description="Process data files.")
    parser.add_argument('input', help='Input file path')
    parser.add_argument('output', help='Output file path')
    parser.add_argument('-v', '--verbose', action='store_true', help='Enable verbose output')
    return parser.parse_args()

def read_data(input_path):
    """Read and parse the input data file."""
    logging.info(f"Reading data from {input_path}")
    try:
        with open(input_path, 'r') as f:
            return f.readlines()
    except Exception as e:
        logging.error(f"Failed to read input file: {e}")
        sys.exit(1)

def process_data(data):
    """Process the input data and return the results."""
    logging.info(f"Processing {len(data)} lines of data")
    # ... processing logic here ...
    return [line.upper() for line in data]  # Example: convert to uppercase

def write_results(output_path, results):
    """Write the processed results to the output file."""
    logging.info(f"Writing results to {output_path}")
    try:
        with open(output_path, 'w') as f:
            f.writelines(results)
    except Exception as e:
        logging.error(f"Failed to write output file: {e}")
        sys.exit(2)

def main():
    """Main entry point for the script."""
    args = parse_arguments()
    setup_logging(args.verbose)
    logging.debug("Starting data processing job")
    data = read_data(args.input)
    results = process_data(data)
    write_results(args.output, results)
    logging.info("Processing completed successfully")
    return 0

if __name__ == "__main__":
    sys.exit(main())
Benefits of this organization:
- Each function has a single responsibility
- Clear entry point through main()
- Proper error handling and logging
- Testable components
- Exit code management
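"Testable components" means a function such as process_data() can be checked on its own, without files or command-line arguments. The sketch below copies that function (minus the logging call) into a small unittest case; the test class name, the sample lines, and the run_tests() helper are invented for illustration.

```python
import unittest


def process_data(data):
    """Same transformation as in the script above: convert each line to uppercase."""
    return [line.upper() for line in data]


class ProcessDataTests(unittest.TestCase):
    def test_uppercases_every_line(self):
        self.assertEqual(process_data(["first\n", "Second\n"]), ["FIRST\n", "SECOND\n"])

    def test_empty_input_gives_empty_output(self):
        self.assertEqual(process_data([]), [])


def run_tests():
    """Run the test case programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ProcessDataTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_tests()
print(f"Tests run: {result.testsRun}, all passed: {result.wasSuccessful()}")
```

In a project you would normally keep the tests in a separate file and run them with python -m unittest rather than calling a helper like run_tests().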
Analogy: Well-Structured Scripts as Recipes
A well-structured script is like a professional recipe:
- Ingredients (arguments, inputs) are clearly listed at the beginning
- Each step (function) has a specific purpose and clear instructions
- Steps are performed in a logical order
- The recipe can be scaled or adapted for different situations
- Experienced chefs (developers) can reuse components in other recipes
Just as a good recipe is easy to follow and adapt, a well-structured script is easy to understand and maintain.
Real-World Script Examples
Data Processing Script
This script processes CSV data, a common task in data analysis:
#!/usr/bin/env python3
"""
Process sales data to generate a summary report.
"""
import csv
import argparse
from collections import defaultdict
from datetime import datetime

def parse_args():
    parser = argparse.ArgumentParser(description='Generate sales report from CSV data')
    parser.add_argument('input_file', help='Input CSV file path')
    parser.add_argument('output_file', help='Output report file path')
    parser.add_argument('--year', type=int, help='Filter by year')
    return parser.parse_args()

def process_sales_data(input_file, year_filter=None):
    sales_by_region = defaultdict(float)
    sales_by_product = defaultdict(float)
    total_sales = 0.0
    with open(input_file, 'r', newline='') as csvfile:
        reader = csv.DictReader(csvfile)
        for row in reader:
            # Parse the date
            date = datetime.strptime(row['date'], '%Y-%m-%d')
            # Apply year filter if specified
            if year_filter and date.year != year_filter:
                continue
            # Extract data
            region = row['region']
            product = row['product']
            amount = float(row['amount'])
            # Update our aggregations
            sales_by_region[region] += amount
            sales_by_product[product] += amount
            total_sales += amount
    return {
        'total_sales': total_sales,
        'sales_by_region': sales_by_region,
        'sales_by_product': sales_by_product,
    }

def write_report(output_file, data):
    with open(output_file, 'w') as f:
        f.write("SALES REPORT\n")
        f.write("=" * 40 + "\n\n")
        f.write(f"Total Sales: ${data['total_sales']:.2f}\n\n")
        f.write("Sales by Region:\n")
        for region, amount in sorted(data['sales_by_region'].items()):
            f.write(f"  {region}: ${amount:.2f}\n")
        f.write("\n")
        f.write("Sales by Product:\n")
        for product, amount in sorted(data['sales_by_product'].items()):
            f.write(f"  {product}: ${amount:.2f}\n")

def main():
    args = parse_args()
    data = process_sales_data(args.input_file, args.year)
    write_report(args.output_file, data)
    print(f"Report written to {args.output_file}")

if __name__ == "__main__":
    main()
Automation Script
This script automates a common development workflow:
#!/usr/bin/env python3
"""
Automate the process of updating code, running tests, and deploying if tests pass.
"""
import os
import subprocess
import argparse
import logging
import sys

def setup_logging(verbose=False):
    level = logging.DEBUG if verbose else logging.INFO
    logging.basicConfig(
        level=level,
        format='%(asctime)s - %(levelname)s - %(message)s',
        datefmt='%Y-%m-%d %H:%M:%S'
    )

def parse_args():
    parser = argparse.ArgumentParser(description='Automate code update and deployment')
    parser.add_argument('repo_dir', help='Repository directory')
    parser.add_argument('--branch', default='main', help='Branch to update')
    parser.add_argument('--deploy', action='store_true', help='Deploy if tests pass')
    parser.add_argument('-v', '--verbose', action='store_true', help='Verbose output')
    return parser.parse_args()

def run_command(command, cwd=None):
    """Run a shell command and return its output and status."""
    logging.debug(f"Running command: {command}")
    try:
        result = subprocess.run(
            command,
            shell=True,
            cwd=cwd,
            check=True,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            text=True
        )
        return True, result.stdout
    except subprocess.CalledProcessError as e:
        # Some tools (pytest, for example) report failures on stdout, not stderr
        return False, e.stderr or e.stdout

def update_code(repo_dir, branch):
    """Pull the latest code from the repository."""
    logging.info(f"Updating code in {repo_dir} (branch: {branch})")
    # Ensure we're on the right branch
    success, output = run_command(f"git checkout {branch}", cwd=repo_dir)
    if not success:
        logging.error(f"Failed to checkout branch {branch}: {output}")
        return False
    # Pull the latest changes
    success, output = run_command("git pull", cwd=repo_dir)
    if not success:
        logging.error(f"Failed to pull latest changes: {output}")
        return False
    logging.info("Code updated successfully")
    return True

def run_tests(repo_dir):
    """Run the test suite."""
    logging.info("Running tests...")
    success, output = run_command("python -m pytest", cwd=repo_dir)
    if not success:
        logging.error(f"Tests failed: {output}")
        return False
    logging.info("All tests passed!")
    return True

def deploy(repo_dir):
    """Deploy the application."""
    logging.info("Deploying application...")
    success, output = run_command("./deploy.sh", cwd=repo_dir)
    if not success:
        logging.error(f"Deployment failed: {output}")
        return False
    logging.info("Deployment successful!")
    return True

def main():
    args = parse_args()
    setup_logging(args.verbose)
    # Ensure the repository directory exists
    if not os.path.isdir(args.repo_dir):
        logging.error(f"Directory not found: {args.repo_dir}")
        return 1
    # Update the code
    if not update_code(args.repo_dir, args.branch):
        return 2
    # Run tests
    if not run_tests(args.repo_dir):
        return 3
    # Deploy if requested and tests passed
    if args.deploy:
        if not deploy(args.repo_dir):
            return 4
    else:
        logging.info("Skipping deployment (use --deploy to deploy)")
    logging.info("All tasks completed successfully")
    return 0

if __name__ == "__main__":
    sys.exit(main())
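One caveat about run_command() in this script: with shell=True, a value such as the branch name becomes part of a shell command line, so unusual input could be interpreted by the shell. A common alternative is to pass the command as a list of arguments. The variant below is a sketch of that approach, not the script's original design; the hostile branch value and the child command are invented to show that the text arrives as a single argument.

```python
import subprocess
import sys


def run_command(args, cwd=None):
    """Run a command given as a list of arguments; return (success, output)."""
    try:
        result = subprocess.run(
            args,
            cwd=cwd,
            check=True,
            capture_output=True,
            text=True,
        )
    except FileNotFoundError as exc:
        # The program itself could not be found
        return False, str(exc)
    except subprocess.CalledProcessError as exc:
        return False, exc.stderr or exc.stdout
    return True, result.stdout


# With a list, this value is passed through as one literal argument
branch = "main; echo injected"
ok, output = run_command(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", branch]
)

print(ok, repr(output))
```

With this variant, the checkout step would be written as run_command(["git", "checkout", branch], cwd=repo_dir).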
Web API Script
This script interacts with a web API and processes the results:
#!/usr/bin/env python3
"""
Fetch weather data from an API and display a forecast.
"""
import argparse
import json
import os
import sys
from datetime import datetime

import requests  # Third-party package: pip install requests

def parse_args():
    parser = argparse.ArgumentParser(description='Display weather forecast for a location')
    parser.add_argument('location', help='City name or postal code')
    parser.add_argument('--api-key', help='API key (or set WEATHER_API_KEY env var)')
    parser.add_argument('--days', type=int, default=3, help='Number of days to forecast')
    parser.add_argument('--output', choices=['text', 'json'], default='text',
                        help='Output format')
    return parser.parse_args()

def get_api_key(args):
    """Get API key from args or environment variable."""
    if args.api_key:
        return args.api_key
    api_key = os.environ.get('WEATHER_API_KEY')
    if not api_key:
        sys.stderr.write("Error: API key not provided. Use --api-key or set WEATHER_API_KEY environment variable.\n")
        sys.exit(1)
    return api_key

def fetch_weather(location, api_key, days=3):
    """Fetch weather data from the API."""
    url = "https://api.example.com/weather"
    params = {
        'location': location,
        'days': days,
        'key': api_key
    }
    try:
        # Set a timeout so the script cannot hang indefinitely on a slow server
        response = requests.get(url, params=params, timeout=10)
        response.raise_for_status()  # Raise exception for 4XX/5XX responses
        return response.json()
    except requests.exceptions.RequestException as e:
        sys.stderr.write(f"Error fetching weather data: {e}\n")
        sys.exit(2)

def format_text_output(data):
    """Format weather data as human-readable text."""
    location = data['location']['name']
    country = data['location']['country']
    current = data['current']
    forecast = data['forecast']['forecastday']
    output = [
        f"Weather for {location}, {country}",
        f"Current: {current['temp_c']}°C, {current['condition']['text']}",
        "\nForecast:",
    ]
    for day in forecast:
        date = datetime.strptime(day['date'], '%Y-%m-%d').strftime('%a, %b %d')
        output.append(f"  {date}: {day['day']['avgtemp_c']}°C, {day['day']['condition']['text']}")
    return '\n'.join(output)

def main():
    args = parse_args()
    api_key = get_api_key(args)
    # Fetch the weather data
    data = fetch_weather(args.location, api_key, args.days)
    # Output the data in the requested format
    if args.output == 'json':
        print(json.dumps(data, indent=2))
    else:
        print(format_text_output(data))
    return 0

if __name__ == "__main__":
    sys.exit(main())
Best Practices for Python Scripts
Script Structure
- Shebang line: Start with #!/usr/bin/env python3 for Unix compatibility
- Docstring: Include a module-level docstring that explains the script's purpose
- Imports: Place imports at the top of the file, grouped by standard library, third-party, and local modules
- Functions: Break the script into logical functions with their own docstrings
- Main entry point: Use the if __name__ == "__main__" pattern
- Exit codes: Return meaningful exit codes from the main function
Error Handling
- Input validation: Validate all inputs, especially user-provided data
- Specific exceptions: Catch specific exceptions rather than using bare except clauses
- Meaningful errors: Provide clear error messages that help the user understand what went wrong
- Resource cleanup: Use with statements to ensure resources are released even if errors occur
Documentation
- Script header: Include author, date, purpose, and usage information
- Function docstrings: Document what each function does, its parameters, and return value
- Complex logic: Explain any complex or non-obvious code with comments
- Usage examples: Include examples in the docstring or a separate section
Testing and Debugging
- Testable code: Write code that can be unit tested
- Verbose mode: Include an option for verbose output to help with debugging
- Logging: Use the logging module instead of print for more control over output
- Dry run mode: For scripts that modify data, consider adding a "dry run" option that shows what would happen without making changes
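A dry run option usually comes down to one flag that guards every destructive action. The sketch below shows one way to wire it up with argparse and logging; the cleanup task, the --dry-run flag name, and the file scratch.tmp are all invented for this example.

```python
import argparse
import logging
import tempfile
from pathlib import Path


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Delete temporary files.")
    parser.add_argument("files", nargs="+", type=Path, help="Files to delete")
    parser.add_argument("--dry-run", action="store_true",
                        help="Log what would be deleted without deleting anything")
    return parser.parse_args(argv)


def delete_files(paths, dry_run=False):
    """Delete each path, or only log the intent when dry_run is set."""
    deleted = []
    for path in paths:
        if dry_run:
            logging.info("Would delete %s", path)
            continue
        path.unlink()
        logging.info("Deleted %s", path)
        deleted.append(path)
    return deleted


logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "scratch.tmp"
    target.write_text("temporary data")

    # First pass: dry run, the file must survive
    args = parse_args([str(target), "--dry-run"])
    delete_files(args.files, dry_run=args.dry_run)
    exists_after_dry_run = target.exists()

    # Second pass: real run, the file is removed
    args = parse_args([str(target)])
    delete_files(args.files, dry_run=args.dry_run)
    exists_after_real_run = target.exists()
```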
Performance and Scalability
- Memory efficiency: Process large data incrementally rather than loading it all into memory
- Progress feedback: For long-running tasks, provide progress updates
- Resource limitations: Be aware of system limitations (file descriptors, memory, etc.)
- Parallel processing: Use multiprocessing for CPU-bound tasks and multithreading (or asyncio) for I/O-bound tasks
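For the memory efficiency point, iterating over a file object reads one line at a time, so even a very large file never has to fit in memory. The sketch below counts matching lines that way; the function name count_matching_lines and the sample log contents are made up for illustration.

```python
import tempfile
from pathlib import Path


def count_matching_lines(path, needle):
    """Count lines containing needle while holding only one line in memory at a time."""
    matches = 0
    with open(path, encoding="utf-8") as handle:
        for line in handle:  # the file object yields lines lazily
            if needle in line:
                matches += 1
    return matches


with tempfile.TemporaryDirectory() as tmp:
    log_file = Path(tmp) / "app.log"
    log_file.write_text(
        "INFO start\nERROR disk full\nINFO retry\nERROR timeout\n",
        encoding="utf-8",
    )
    error_count = count_matching_lines(log_file, "ERROR")

print(f"Lines containing ERROR: {error_count}")
```

By contrast, f.readlines() or f.read() would load the entire file before any processing starts.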
Practical Exercises
Exercise 1: Basic Script Creation
- Create a script that asks the user for their name and age
- Calculate how many days they have been alive (approximately)
- Tell them how old they will be in 2030
- Make the script executable (on Unix systems) or runnable from the command line
Exercise 2: Command-Line Tool
- Create a command-line tool that counts words, lines, and characters in a text file
- Use argparse to handle command-line options
- Add options to count only words, only lines, or only characters
- Add an option to exclude common words (e.g., "the", "and", "a")
- Make the script work with multiple input files
Exercise 3: Data Processing Script
- Create a script that reads a CSV file containing data (e.g., sales records, student grades)
- Process the data (calculate totals, averages, etc.)
- Generate a report in either text or HTML format
- Add command-line options to filter the data
- Handle errors gracefully (file not found, invalid data, etc.)
Wrapping Up and Next Steps
Today we've covered the essentials of running Python scripts, from basic execution to advanced techniques and best practices. You now have the knowledge to create robust, reusable, and maintainable Python scripts for a wide variety of tasks.
Key Takeaways
- Python scripts are saved text files with a .py extension that contain executable Python code
- Scripts can be run from the command line, IDEs, or made directly executable
- Command-line arguments make scripts flexible and reusable
- The if __name__ == "__main__" pattern allows code to be both importable and executable
- Well-structured scripts are modular, testable, and handle errors gracefully
- Real-world scripts often involve data processing, automation, or API interactions
Where to Go from Here
- Practice creating scripts for your own common tasks and workflows
- Explore additional standard library modules that help with script creation (e.g., pathlib, csv, json)
- Learn about packaging and distribution to share your scripts with others
- Dive into testing frameworks to ensure your scripts work correctly
- Explore automation tools that can run your scripts on schedules or in response to events
Additional Resources
- Official Python Documentation: argparse module
- Official Python Documentation: __main__ — Top-level script environment
- Real Python: Command-Line Arguments in Python
- Real Python: Python Application Layouts
- Click: A Python package for creating command-line interfaces
In our next session, we'll build on these concepts as we explore Python's data structures and how to effectively work with them in your scripts and applications.