Second Assignment by 陈彦吉

news/2025/3/11 0:31:37/Article source: https://www.cnblogs.com/BlueSky295/p/18493767

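Course for this assignment	https://edu.cnblogs.com/campus/zjlg/rjjc
Assignment goal	Implement a command-line text-counting program that correctly counts the characters, words, and other statistics in an imported txt file
Name and student ID	`陈彦吉` `2022329301139`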

Assignment repository on Gitee: https://gitee.com/BlueSky295/STFB

Reflections on the Second Assignment

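This semester I happen to be taking 吕欣欣's Python Programming course (I strongly recommend her Python class!!!), so I struck while the iron was hot and did the whole project in Python, which also helped me consolidate what I had learned.
This was my first time using Gitee (I had never touched it before), and the first step is always the hardest. I read many Gitee + VS Code setup tutorials online and finally got a basic grasp of how Gitee works. A quick word of praise for VS Code: it is really pleasant to use. For the final Performance Test, though, I relied on the Profile tool in PyCharm Professional, so each of the two IDEs has its own strengths.
I picked up quite a few new skills from this assignment, although the process was painful 😭. All in all, I'm glad I managed to finish this small project. 😊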

README

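Project Overview

This is a Python project that implements command-line text counting.

Basic features: correctly counts the characters, words, and sentences in an imported plain-English txt file.
Extended features: correctly counts code lines, blank lines, and comment lines in an imported source file (Python, C, C++, Java, and JavaScript are supported), with a command-line option for each.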

File List

├───v0.1 							# empty project
│   │   
│   └─── v0.1.py
│               
├───v0.2 							# basic features completed
│   │   SONG.txt 					# test file
│   │   v0.2.py						# main program
│   │   基础功能测试结果.png
│   │   记事本计算结果比对.png
│   │   异常捕获.png
│   │ 
│   └───v0.2单元测试                 # unit tests
│           v0.2基本功能测试.md     
│           
│           
├───v0.3 							# extended features completed
│   │   Test.py 					# test file
│   │   v0.3.py	 					# main program
│   │   扩展功能测试结果.png
│   │   VS Code Counter计算结果比对.png
│   │   异常捕获.png
│   │   
│   └───v0.3单元测试
│            v0.3扩展功能测试.md
│           
│           
├───Performance Test				# performance analysis
│   ├───v0.2PerformanceTest         # Performance Test results for v0.2
│   │       v0.2Statistics
│   │       v0.2Call Graph
│   │       v0.2PerformanceTest.pstat
│   │      
│   └───v0.3PerformanceTest         # Performance Test results for v0.3
│           v0.Statistics
│           v0.3Call Graph
│           v0.3PerformanceTest.pstat

Usage

Basic Commands

python v0.x.py -c file.txt # count characters
python v0.x.py -w file.txt # count words
python v0.x.py -l file.txt # count sentences

Here v0.x stands for version v0.2 or v0.3.
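
To make the three basic flags concrete, here is a minimal sketch of how such a counter could be put together. It is not the code from v0.2.py. It assumes that a word is any whitespace-separated token and that a sentence ends at a run of `.`, `!`, or `?`, so its totals may differ from the figures this project reports. All function names (`count_chars`, `count_words`, `count_sentences`, `build_parser`, `main`) are hypothetical.

```python
import argparse
import re


def count_chars(text: str) -> int:
    """Count every character, including spaces and newlines."""
    return len(text)


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of a word)."""
    return len(text.split())


def count_sentences(text: str) -> int:
    """Count runs of sentence-ending punctuation (., !, ?)."""
    return len(re.findall(r"[.!?]+", text))


# Maps the mode chosen on the command line to its counting function.
COUNTERS = {
    "chars": count_chars,
    "words": count_words,
    "sentences": count_sentences,
}


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts exactly one of -c, -w, -l plus a file name."""
    parser = argparse.ArgumentParser(description="Text counting sketch")
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("-c", dest="mode", action="store_const", const="chars",
                       help="count characters")
    group.add_argument("-w", dest="mode", action="store_const", const="words",
                       help="count words")
    group.add_argument("-l", dest="mode", action="store_const", const="sentences",
                       help="count sentences")
    parser.add_argument("file", help="path to a plain-text file")
    return parser


def main(argv=None) -> int:
    """Parse arguments, read the file, and print the requested count."""
    args = build_parser().parse_args(argv)
    try:
        with open(args.file, encoding="utf-8") as handle:
            text = handle.read()
    except OSError as exc:
        # Report unreadable or missing files instead of crashing.
        print(f"Cannot read {args.file}: {exc}")
        return 1
    print(f"{args.mode}: {COUNTERS[args.mode](text)}")
    return 0
```

Calling `main(["-w", "SONG.txt"])` from Python mirrors `python v0.x.py -w SONG.txt`. To run the sketch as a script, end the file with `raise SystemExit(main())`.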

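Extended Commands

python v0.3.py -C [source file] # count code lines in a source file
python v0.3.py -E [source file] # count blank lines in a source file
python v0.3.py -M [source file] # count comment lines in a source file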
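
The three extended counts can be sketched as a single pass that puts every line into one of three buckets. This is an illustration only, not the logic of v0.3.py. It assumes a comment line is one whose first non-blank characters are the language's line-comment prefix. It ignores block comments such as `/* ... */` and docstrings, and treats a line with a trailing comment as code. `LINE_COMMENT`, `classify_lines`, and `count_file` are hypothetical names.

```python
from pathlib import Path

# Hypothetical mapping from file extension to line-comment prefix.
LINE_COMMENT = {
    ".py": "#",
    ".c": "//",
    ".cpp": "//",
    ".java": "//",
    ".js": "//",
}


def classify_lines(source: str, prefix: str = "#") -> dict:
    """Return counts of code, blank, and comment lines in the source text."""
    counts = {"code": 0, "blank": 0, "comment": 0}
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            counts["blank"] += 1        # empty or whitespace-only line
        elif stripped.startswith(prefix):
            counts["comment"] += 1      # the whole line is a comment
        else:
            counts["code"] += 1         # everything else counts as code
    return counts


def count_file(path: str) -> dict:
    """Pick the comment prefix from the file extension and classify the file."""
    file_path = Path(path)
    prefix = LINE_COMMENT.get(file_path.suffix.lower(), "#")
    return classify_lines(file_path.read_text(encoding="utf-8"), prefix)
```

For example, `count_file("Test.py")["comment"]` would correspond to the `-M` option under these simplified rules. Different tools draw the line between code and comment differently, so totals can vary from one counter to another.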

Basic Feature Demo

The SONG.txt text used for testing is shown below (an excerpt from Poetical Sketches by the English poet William Blake):

When early morn walks forth in sober grey;
Then to my black ey'd maid I haste away,
When evening sits beneath her dusky bow'r,
And gently sighs away the silent hour;
The village bell alarms, away I go;
And the vale darkens at my pensive woe.

To that sweet village, where my black ey'd maid
Doth drop a tear beneath the silent shade,
I turn my eyes; and, pensive as I go,
Curse my black stars, and bless my pleasing woe.

Oft when the summer sleeps among the trees,
Whisp'ring faint murmurs to the scanty breeze,
I walk the village round; if at her side
A youth doth walk in stolen joy and pride,
I curse my stars in bitter grief and woe,
That made my love so high, and me so low.

O should she e'er prove false, his limbs I'd tear,
And throw all pity on the burning air;
I'd curse bright fortune for my mixed lot,
And then I'd die in peace, and be forgot.

Input:

python v0.2.py -c SONG.txt
python v0.2.py -w SONG.txt
python v0.2.py -l SONG.txt

Output:

Characters in the William Blake poem: 857
Words in the William Blake poem: 176
Sentences in the William Blake poem: 5

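Result Comparison

While doing the assignment I happened to notice that Notepad has a counting feature (shown live in the lower-left corner). It only counts characters, but that is enough to check the correctness of v0.2.py in the v0.2 project.

Basic feature test results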

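Extended Feature Demo

The Test.py source file used for testing is shown below (dug out of an old folder; I no longer remember where it came from 💦💦💦):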

import pandas as pd
import pulp as pp
from tqdm import tqdm

data_path = 'sc60.csv'

try:
    hourly_prediction = pd.read_csv(data_path, encoding='GBK')
    print("Data loaded successfully using GBK encoding.")
except Exception as e:
    print("Failed to read the CSV file with GBK encoding:", e)
    try:
        hourly_prediction = pd.read_csv(data_path, encoding='utf-8-sig')
        print("Data loaded successfully using utf-8-sig encoding.")
    except Exception as e:
        print("Failed to read the CSV file with utf-8-sig encoding:", e)

# Set the column names and parse the dates
hourly_prediction.columns = ['分拣中心', '日期', '班次', '预测货量']
hourly_prediction['日期'] = pd.to_datetime(hourly_prediction['日期'])

# Work only on the data for sorting center SC60
sc60_data = hourly_prediction[hourly_prediction['分拣中心'] == 'SC60']


def solve_optimization(center, shifts_data):
    # Create the optimization problem
    mylp = pp.LpProblem("Staffing Optimization", pp.LpMinimize)

    # Get the date range and the shifts
    dates = sorted(shifts_data['日期'].unique())
    shifts = shifts_data['班次'].unique()

    # Define the decision variables
    x = {(shift, date): pp.LpVariable(f"正式工_{shift}_{date.strftime('%Y%m%d')}", lowBound=0, cat="Integer")
         for date in dates for shift in shifts}
    y = {(shift, date): pp.LpVariable(f"临时工_{shift}_{date.strftime('%Y%m%d')}", lowBound=0, cat="Integer")
         for date in dates for shift in shifts}

    # Minimize the total number of workers
    mylp += pp.lpSum(x[shift, date] + y[shift, date] for shift in shifts for date in dates)

    # Constraints that satisfy demand
    for date in dates:
        for shift in shifts:
            demand = shifts_data.loc[(shifts_data['日期'] == date) & (shifts_data['班次'] == shift), '预测货量'].item()
            mylp += 25 * x[shift, date] + 20 * y[shift, date] >= demand

    # Constraint on the total number of regular workers
    mylp += pp.lpSum(x[shift, date] for shift in shifts for date in dates) <= 200

    # Solve the problem
    mylp.solve(pp.PULP_CBC_CMD(msg=True))

    # Collect the results
    results = [{
        '分拣中心': center,
        '日期': date.strftime('%Y-%m-%d'),
        '班次': shift,
        '正式工人数': pp.value(x[shift, date]),
        '临时工人数': pp.value(y[shift, date])
    } for date in dates for shift in shifts]

    return results


# Apply the optimization to sorting center SC60
results = []
for date in tqdm(sorted(sc60_data['日期'].unique()), desc="Optimizing SC60"):
    date_data = sc60_data[sc60_data['日期'] == date]
    results.extend(solve_optimization('SC60', date_data))

# Save the list of results
results_df = pd.DataFrame(results)
results_df.to_csv('optimized_staff_schedule_SC60.csv', index=False)
print("Optimization complete for SC60. Results saved to 'optimized_staff_schedule_SC60.csv'.")

Input:

python v0.3.py -C Test.py
python v0.3.py -E Test.py 
python v0.3.py -M Test.py

Output:

Code lines: 47
Blank lines: 14
Comment lines: 12

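Result Comparison

VS Code has an extension called VS Code Counter that reports line counts and code volume, which makes it a convenient way to check the accuracy of v0.3.py.

Extended feature test results

VS Code Counter result comparison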

Performance Test

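My favorite IDE is actually VS Code, and all the code for this project was written in it. After a long search online, however, I could not find a simple, practical performance-testing extension or tool for it. In the end I chose Profile, a powerful graphical performance-testing tool built into PyCharm Professional (it is not available in the Community edition). With this tool you can do the following: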