Contents: ASP.NET MVC Enterprise-Level Practice (table of contents)

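Sites such as www.verycd.com, cnblogs (博客园), Taobao, and JD.com all offer site search, and it serves them well in both performance and user experience. This section implements site search with Lucene.Net.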

A preview of the finished demo is shown in Figures 10-22 to 10-24.

Figure 10-22

Figure 10-23

Figure 10-24

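Section 10.4 completed a first version of the search, but a good deal is still left to improve. For example, I want to count the keywords that are searched most often (the hot keywords) and, as Baidu does, show related hot keywords in a drop-down list as the user types. Paging of search results and an article detail view are also missing.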

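1. Data objects

1.1 Structures

  • vector. A collection of values of the same type; a column vector by default. A factor is also a special kind of vector.
  • matrix. Organizes several vectors of the same type. Columns are variables and rows are observations.
  • array. A collection of several two-dimensional tables.
  • data frame. Similar to a matrix, but its variables can have different storage types.
  • list. A collection of vectors, matrices, arrays, and data frames, used to "package" related statistical results.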


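10.5.1 Hot-keyword statistics

Approach:

1. First, be clear about one thing: keyword statistics do not need to be real-time, so they can be computed periodically.

2. Every search a user makes has to be stored, otherwise there is nothing to count.

Point 1 suggests a summary table, and point 2 suggests a detail table of search records. The plan is then straightforward: periodically run a GROUP BY query over the detail table and put the result into the summary table. How should it be put there? With UPDATE statements? There is a faster way: truncate the summary table first, then insert the fresh result.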
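The truncate-then-insert idea can be sketched in memory with LINQ. This is only an illustration under assumptions: the row classes, the HotWordSummary helper, and the sample data are hypothetical stand-ins for the two tables, and the 30-day window uses a simple timestamp cutoff rather than SQL Server's DateDiff day-boundary counting.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one row of the detail table.
public class SearchDetailRow
{
    public string KeyWords { get; set; }
    public DateTime SearchDateTime { get; set; }
}

// Hypothetical stand-in for one row of the summary table.
public class SearchTotalRow
{
    public Guid Id { get; set; }
    public string KeyWords { get; set; }
    public int SearchCounts { get; set; }
}

public static class HotWordSummary
{
    // Builds a brand-new summary instead of updating the old one:
    // the in-memory equivalent of "truncate, then insert the GROUP BY result".
    public static List<SearchTotalRow> Rebuild(IEnumerable<SearchDetailRow> details, DateTime now, int days)
    {
        DateTime cutoff = now.AddDays(-days);
        return details
            .Where(d => d.SearchDateTime >= cutoff)   // keep only recent searches
            .GroupBy(d => d.KeyWords)                 // one group per keyword
            .Select(g => new SearchTotalRow { Id = Guid.NewGuid(), KeyWords = g.Key, SearchCounts = g.Count() })
            .OrderByDescending(t => t.SearchCounts)
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        DateTime now = new DateTime(2016, 1, 31);
        var details = new List<SearchDetailRow>
        {
            new SearchDetailRow { KeyWords = "lucene", SearchDateTime = new DateTime(2016, 1, 30) },
            new SearchDetailRow { KeyWords = "lucene", SearchDateTime = new DateTime(2016, 1, 15) },
            new SearchDetailRow { KeyWords = "mvc",    SearchDateTime = new DateTime(2016, 1, 20) },
            new SearchDetailRow { KeyWords = "redis",  SearchDateTime = new DateTime(2015, 11, 1) } // outside the window
        };

        foreach (SearchTotalRow t in HotWordSummary.Rebuild(details, now, 30))
            Console.WriteLine(t.KeyWords + ": " + t.SearchCounts);
    }
}
```

Rebuilding from scratch avoids having to decide, keyword by keyword, whether a summary row needs an insert, an update, or a delete.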

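Table 10-1  Search summary table SearchTotals

Field name        Field type      Description
Id                char(36)        Primary key, stored as a GUID
KeyWords          nvarchar(50)    Search keyword
SearchCounts      int             Number of searches

Table 10-2  Search detail table SearchDetails

Field name        Field type      Description
Id                char(36)        Primary key, stored as a GUID
KeyWords          nvarchar(50)    Search keyword
SearchDateTime    datetime        Time of the search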

Steps:

(1) In the Models folder, create two classes, SearchTotal and SearchDetail.

SearchTotal.cs:

using System;
using System.ComponentModel.DataAnnotations;

namespace SearchDemo.Models
{
    public class SearchTotal
    {
        public Guid Id { get; set; }
        [StringLength(50)]
        public string KeyWords { get; set; }
        public int SearchCounts { get; set; }
    }
}

SearchDetail.cs:

using System;
using System.ComponentModel.DataAnnotations;

namespace SearchDemo.Models
{
    public class SearchDetail
    {
        public Guid Id { get; set; }
        [StringLength(50)]
        public string KeyWords { get; set; }
        public Nullable<DateTime> SearchDateTime { get; set; }
    }
}

(2) Modify the SearchDemoContext class, adding the properties SearchTotal and SearchDetail.

using System.Data.Entity;

namespace SearchDemo.Models
{
    public class SearchDemoContext : DbContext
    {
        public SearchDemoContext() : base("name=SearchDemoContext") { }
        public DbSet<Article> Article { get; set; }
        // The two properties below are new.
        public DbSet<SearchTotal> SearchTotal { get; set; }
        public DbSet<SearchDetail> SearchDetail { get; set; }
    }
}

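(3) Update the database

Because the EF context changed and two model classes were added, the database has to be updated through a migration.

Rebuild the application, then choose Tools -> Library Package Manager -> Package Manager Console.

In the console, type enable-migrations -force and press Enter. A Migrations folder then appears in the project in Solution Explorer. Open Configuration.cs, change AutomaticMigrationsEnabled to true, then run update-database in the console. When it finishes, the SearchDemo database has two new tables, SearchTotals and SearchDetails, and the existing Articles table is unchanged, as shown in Figure 10-20.

Figure 10-20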

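(4) Save the search records

Each time a user searches, the search has to be recorded in the SearchDetails table. For convenience, I insert the record into SearchDetails immediately after each search is submitted, that is, synchronously. In practice, to make searching faster, the write can be made asynchronous: push the search record onto a redis queue first, and have a background thread listen on the queue and write the queued records to the table. Writing a record to redis on each search is far cheaper than writing it straight to a relational database.

            // First insert the searched keyword into the detail table.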
            SearchDetail _SearchDetail = new SearchDetail { Id = Guid.NewGuid(), KeyWords = kw, SearchDateTime = DateTime.Now };
            db.SearchDetail.Add(_SearchDetail);
            int r = db.SaveChanges();
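The asynchronous variant described above can be sketched with an in-process queue. This is a minimal sketch, not the redis setup itself: BlockingCollection stands in for the redis queue, the SearchLogQueue class is hypothetical, and the writeToDatabase delegate stands in for the SearchDetails insert.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper; an in-process stand-in for the redis queue.
public static class SearchLogQueue
{
    private static readonly BlockingCollection<string> Queue = new BlockingCollection<string>();

    // Called on the request path: it only enqueues, so the search
    // itself does not wait for the database write.
    public static void Enqueue(string keyword)
    {
        Queue.Add(keyword);
    }

    // Background consumer: takes keywords off the queue one by one
    // and hands each to the supplied writer.
    public static Task StartConsumer(Action<string> writeToDatabase)
    {
        return Task.Run(() =>
        {
            foreach (string keyword in Queue.GetConsumingEnumerable())
                writeToDatabase(keyword);
        });
    }

    // Signals that no more items will arrive, which lets the consumer finish.
    public static void Complete()
    {
        Queue.CompleteAdding();
    }
}

public static class Program
{
    public static void Main()
    {
        var written = new List<string>();
        // Stand-in for "insert into SearchDetails": just collect the keywords.
        Task consumer = SearchLogQueue.StartConsumer(k => written.Add(k));

        SearchLogQueue.Enqueue("lucene");
        SearchLogQueue.Enqueue("mvc");

        SearchLogQueue.Complete();
        consumer.Wait();
        Console.WriteLine(written.Count);
    }
}
```

An in-process queue loses pending records if the web process restarts, which is one reason an external queue such as redis is the more robust choice for production.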

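(5) Update the SearchTotals table on a schedule

A scheduled task like this is a natural fit for the Quartz.Net framework. For convenience I host the Quartz.Net job in a console program, although at work I prefer to host it in a Windows service. If necessary, the program that refreshes SearchTotals can be deployed to a separate server to take load off the web server.

  1. Create a console program named QuartzNet and add assembly references to Quartz.dll and Common.Logging.dll. Using the Database First approach, add an ADO.NET Entity Data Model and include the tables SearchTotals and SearchDetails.

2. Add a KeyWordsTotalService.cs class with two methods: one empties the SearchTotals table, and the other inserts the grouped query result from SearchDetails into SearchTotals. Only the search details from the last 30 days are counted.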

namespace QuartzNet
{
    public class KeyWordsTotalService
    {
        private SearchDemoEntities db = new SearchDemoEntities();
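        /// <summary>
        /// Inserts the grouped counts from the detail table into the summary table.
        /// </summary>
        /// <returns>true if at least one summary row was inserted</returns>
        public bool InsertKeyWordsRank()
        {
            // Only searches from the last 30 days are counted.
            string sql = "insert into SearchTotals(Id,KeyWords,SearchCounts) " +
                         "select newid(),KeyWords,count(*) from SearchDetails " +
                         "where DateDiff(day,SearchDetails.SearchDateTime,getdate())<=30 " +
                         "group by SearchDetails.KeyWords";
            return this.db.Database.ExecuteSqlCommand(sql) > 0;
        }
        /// <summary>
        /// Deletes all rows from the summary table.
        /// </summary>
        /// <returns>true when the statement ran without throwing</returns>
        public bool DeleteAllKeyWordsRank()
        {
            string sql = "truncate table SearchTotals";
            // TRUNCATE TABLE does not report affected rows, so the result of
            // ExecuteSqlCommand cannot be compared with 0 here.
            this.db.Database.ExecuteSqlCommand(sql);
            return true;
        }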
    }
}

3. Add a TotalJob.cs class that implements the IJob interface and its Execute method.

using Quartz;

namespace QuartzNet
{
    public class TotalJob : IJob
    {
        /// <summary>
        /// Rebuilds the summary table from the detail table.
        /// </summary>
        /// <param name="context"></param>
        public void Execute(JobExecutionContext context)
        {
            KeyWordsTotalService bll = new KeyWordsTotalService();
            bll.DeleteAllKeyWordsRank();
            bll.InsertKeyWordsRank();
        }
    }
}

4. Modify Program.cs:

using Quartz;
using Quartz.Impl;
using System;

namespace QuartzNet
{
    class Program
    {
        static void Main(string[] args)
        {
            IScheduler sched;
            ISchedulerFactory sf = new StdSchedulerFactory();
            sched = sf.GetScheduler();
            JobDetail job = new JobDetail("job1", "group1", typeof(TotalJob)); // TotalJob is the class that implements IJob
            DateTime ts = TriggerUtils.GetNextGivenSecondDate(null, 5); // first run a few seconds from now, at the next 5-second mark
            TimeSpan interval = TimeSpan.FromSeconds(50); // run every 50 seconds
            Trigger trigger = new SimpleTrigger("trigger1", "group1", "job1", "group1", ts, null,
                                                    SimpleTrigger.RepeatIndefinitely, interval); // repeat at a fixed interval; the interval could come from a configuration file

            sched.AddJob(job, true);
            sched.ScheduleJob(trigger);
            sched.Start();
            Console.ReadKey();
        }
    }
}

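Here the job and its schedule are written directly in code, again for convenience. In real work, put as much of this information as possible into a configuration file, so that later changes only touch configuration and not code.

To see results quickly, the statistics run every 50 seconds here. In a real application the interval might be several hours or even a day: statistics over large volumes of data like these have loose real-time requirements, so database reads and writes should be kept to a minimum.

Keep the QuartzNet console program running and perform some searches; the background job then generates the search statistics periodically.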

1.2 Vectors

# inspect the structure of an object
str()
# manage objects
ls()
rm()
remove()

  Creating vectors

X<-c(1,1,1)
length(X)
Y<-c(2,2,2)
temp<-c(14.7,18.5,25.9)
RH<-c(66,73,41)
wind<-c(2.7,8.5,3.6)
rain<-c(0,0,0)
area<-rain
month<-c("aug","aug","aug")
day<-rep("fri",each=3)
rank<-seq(from=1,to=3,by=1)
str(month)
str(rank)
ls()

  

rep(begin:end, each = n)   # repeat each value n times
rep(begin:end, times = n)  # repeat the whole vector n times
seq(from=,to=,by=)

seq(from=,to=,length=)

vector(length=)  # number of elements the vector holds

10.5.2 Popular searches

  Accessing vectors

1. Access elements at given positions

vector_name[position]
vector_name[position1:position2]
vector_name[c(list of positions)]

  

> a<-vector(length=10)
> a
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> a[1]<-1
> a[2:4]<-c(2,3,4)
> a
 [1] 1 2 3 4 0 0 0 0 0 0
> b<-seq(from=5,to=9,by=1)
> a[c(5:9,10)]<-c(b,10)  # access elements 5 to 9 and element 10, and assign the values 5 to 10
> a
 [1]  1  2  3  4  5  6  7  8  9 10

  

2. Access elements through a position vector

vector_name[position_vector_name]

  

> b<-(2:4)
> a[b]
[1] 2 3 4
> b<-c(TRUE,FALSE,FALSE,TRUE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE)
> a[b]
[1] 1 4

  

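3. Access all elements except those at given positions

vector_name[-position]
vector_name[-(position1:position2)]
vector_name[-c(list of positions)]
vector_name[-position_vector_name]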

  

> a[-(2:4)]
[1]  1  5  6  7  8  9 10
> a[-c(5:9,10)]
[1] 1 2 3 4
> b<-(2:4)
> a[-b]
[1]  1  5  6  7  8  9 10

  

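10.5.2.1 Displaying popular searches

This simply sorts the SearchTotals table by search count in descending order and takes the first few rows.

Add the following code to the Index method of the LastSearch controller: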

var keyWords = db.SearchTotal.OrderByDescending(a => a.SearchCounts).Select(x => x.KeyWords).Skip(0).Take(6).ToList();
            ViewBag.KeyWords = keyWords;

In the view:

<div id="divKeyWords">热门搜索:@if (ViewBag.KeyWords != null) {
             foreach (string v in ViewBag.KeyWords) { 
              <a href="#">@v</a>
             }
         }</div>

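Next, I want the behaviour shown in Figure 10-21:

Figure 10-21

Clicking a hot keyword loads it into the text box and clicks the "搜索" (Search) button automatically.

Add this code to the view: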

<script type="text/javascript">
    $(function () {
        $("#divKeyWords a").click(function () {
            $("#txtSearch").val($(this).html());
            $("#btnSearch").click();
        });
});
</script>

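1.3 Matrices

Combining multiple vectors

# combine vectors as columns
cbind(list of vector names)
# show the dimensions (rows, columns)
dim(matrix_name)
# names
colnames()
colnames(matrix_name[, col1:col2])
rownames()
rownames(matrix_name[row1:row2, ])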

  

> ForeData<-cbind(X,Y,temp,RH,wind,rain,area,rank)
> dim(ForeData)
[1] 3 8
> ForeData
     X Y temp RH wind rain area rank
[1,] 1 2 14.7 66  2.7    0    0    1
[2,] 1 2 18.5 73  8.5    0    0    2
[3,] 1 2 25.9 41  3.6    0    0    3
> str(ForeData)
 num [1:3, 1:8] 1 1 1 2 2 2 14.7 18.5 25.9 66 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:8] "X" "Y" "temp" "RH" ...
> colnames(ForeData)
[1] "X"    "Y"    "temp" "RH"   "wind" "rain" "area" "rank"
> colnames(ForeData[,3:5])
[1] "temp" "RH"   "wind"
> rownames(ForeData)<-c("1","2","3")
> rownames(ForeData[c(1,3),])
[1] "1" "3"
> is.matrix(ForeData)
[1] TRUE

  

a<-(1:9)
b<-(1:3)
c<-(1:2)
cbind(a,b)
cbind(a,b,c)
rbind(a,b)  # combine as rows
rm(a,b,c)

 

2. If the data for the matrix already exists in a vector, the vector can be reshaped into a matrix in a chosen way

matrix(data = NA, nrow = 1, ncol = 1, byrow = FALSE,
       dimnames = list(rownames,colnames))

  

> a<-(1:30)
> dim1<-c("R1","R2","R3","R4","R5")
> dim2<-c("C1","C2","C3","C4","C5","C6")
> a<-matrix(a,nrow=5,ncol=6,byrow=FALSE,dimnames=list(dim1,dim2))
> a
   C1 C2 C3 C4 C5 C6
R1  1  6 11 16 21 26
R2  2  7 12 17 22 27
R3  3  8 13 18 23 28
R4  4  9 14 19 24 29
R5  5 10 15 20 25 30

  Accessing matrix elements

1. Elements at given positions

matrix_name[row, col]
matrix_name[row1:row2, col1:col2]
matrix_name[c(list of rows), c(list of columns)]

 

> ForeData
  X Y temp RH wind rain area rank
1 1 2 14.7 66  2.7    0    0    1
2 1 2 18.5 73  8.5    0    0    2
3 1 2 25.9 41  3.6    0    0    3
> ForeData[2,3]
[1] 18.5
> ForeData[1:2,1:3]
  X Y temp
1 1 2 14.7
2 1 2 18.5
> a<-(1:2)
> ForeData[a,c(1,3)]
  X temp
1 1 14.7
2 1 18.5
> ForeData[c(1,3),]
  X Y temp RH wind rain area rank
1 1 2 14.7 66  2.7    0    0    1
3 1 2 25.9 41  3.6    0    0    3

  2. Access elements through the edit window

fix(ForeData)

  

 

 


 

Matrix computation

1. Combining

(m1<-matrix(1,nrow=2,ncol=2))
(m2<-matrix(2,nrow=2,ncol=2))
(mm1<-cbind(m1,m2))
(mm2<-rbind(m1,m2))

  

2. Multiplication

%*%

(mm3<-mm1%*%mm2)
(mm3<-mm2%*%mm1)

  

3. Creating diagonal matrices

> diag(8)  # create an 8-by-8 identity matrix
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]    1    0    0    0    0    0    0    0
[2,]    0    1    0    0    0    0    0    0
[3,]    0    0    1    0    0    0    0    0
[4,]    0    0    0    1    0    0    0    0
[5,]    0    0    0    0    1    0    0    0
[6,]    0    0    0    0    0    1    0    0
[7,]    0    0    0    0    0    0    1    0
[8,]    0    0    0    0    0    0    0    1
> diag(c(1,2,3,4)) 
     [,1] [,2] [,3] [,4]
[1,]    1    0    0    0
[2,]    0    2    0    0
[3,]    0    0    3    0
[4,]    0    0    0    4
> diag(c(1,2,3,4),nrow=3,ncol=4)
     [,1] [,2] [,3] [,4]
[1,]    1    0    0    0
[2,]    0    2    0    0
[3,]    0    0    3    0

  

4. Transpose and inverse

t()      # transpose
solve()  # inverse
eigen()  # eigenvalues and eigenvectors

  

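10.5.2.2 Search suggestion drop-down

Here I bring in a third-party JavaScript component, Autocomplete. As the user types in the text box, it fetches data from the server and shows it in a drop-down list.

Autocomplete.rar is provided on the cloud drive. Extract it and copy it into the lib directory of the SearchDemo project.

Add a method to the KeyWordsTotalService.cs class in the SearchDemo project: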

using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;
using SearchDemo.Models;

namespace SearchDemo.Common
{
    public class KeyWordsTotalService
    {
        private SearchDemoContext db = new SearchDemoContext();

        public List<string> GetSearchMsg(string term)
        {
            // Building the SQL by string concatenation is open to SQL injection, so it stays commented out:
            //string sql = "select KeyWords from SearchTotals where KeyWords like '"+term.Trim()+"%'";
            //return db.Database.SqlQuery<string>(sql).ToList();
            // Parameterized prefix match instead. Exceptions propagate to the caller with their stack trace intact.
            string sql = "select KeyWords from SearchTotals where KeyWords like @term";
            return db.Database.SqlQuery<string>(sql, new SqlParameter("@term", term + "%")).ToList();
        }
    }
}
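
The parameterized query blocks SQL injection, but any % or _ the user types still acts as a LIKE wildcard. A possible refinement, sketched here under the assumption that the database is SQL Server, is to wrap those characters in brackets before appending the trailing %. The LikeEscaper helper is hypothetical and not part of the original project.

```csharp
using System;
using System.Text;

// Hypothetical helper: escapes SQL Server LIKE wildcards in user input.
public static class LikeEscaper
{
    // Wraps %, _ and [ in brackets so that LIKE treats them as literal characters.
    public static string EscapeLikePattern(string term)
    {
        var sb = new StringBuilder();
        foreach (char c in term)
        {
            if (c == '%' || c == '_' || c == '[')
                sb.Append('[').Append(c).Append(']');
            else
                sb.Append(c);
        }
        return sb.ToString();
    }
}

public static class Program
{
    public static void Main()
    {
        // The escaped value plus a trailing % is what would be passed as the @term parameter.
        Console.WriteLine(LikeEscaper.EscapeLikePattern("50%_off") + "%");
    }
}
```

With this in place, a search for "50%" suggests only keywords that literally start with "50%" rather than everything that starts with "50".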

Then add a method to the LastSearch controller:

        /// <summary>
        /// Returns the keywords that match the typed prefix, for the autocomplete list.
        /// </summary>
        /// <param name="term"></param>
        /// <returns></returns>
        public string GetKeyWordsList(string term)
        {
            if (string.IsNullOrWhiteSpace(term))
                return null;

            var list = new KeyWordsTotalService().GetSearchMsg(term);
            // Serialize the result.
            // Prefer Newtonsoft.Json over JavaScriptSerializer, which performs poorly.
            //System.Web.Script.Serialization.JavaScriptSerializer js = new System.Web.Script.Serialization.JavaScriptSerializer();
            //return js.Serialize(list.ToArray());
            return JsonConvert.SerializeObject(list.ToArray());
        }

Now the view:

<link href="~/lib/Autocomplete/css/ui-lightness/jquery-ui-1.8.17.custom.css" rel="stylesheet" />
<script src="~/lib/Autocomplete/js/jquery-ui-1.8.17.custom.min.js"></script>
<script type="text/javascript">
    $(function () {
        $("#divKeyWords a").click(function () {
            $("#txtSearch").val($(this).html());
            $("#btnSearch").click();
        });
        getKeyWordsList("txtSearch");
    });
    // Attach the autocomplete suggestion list to the search box
    function getKeyWordsList(txt) {
        if (txt == undefined || txt == "")
            return;
        $("#" + txt).autocomplete({
            source: "/LastSearch/GetKeyWordsList",
            minLength: 1
        });
    }
</script>

1.3 Arrays

array(data = NA, dim = length(data), dimnames = list(list of dimension names))

  

a<-(1:60)
dim1<-c("R1","R2","R3","R4")
dim2<-c("C1","C2","C3","C4","C5")
dim3<-c("T1","T2","T3")
a<-array(a,c(4,5,3),dimnames=list(dim1,dim2,dim3))
>a
, , T1

   C1 C2 C3 C4 C5
R1  1  5  9 13 17
R2  2  6 10 14 18
R3  3  7 11 15 19
R4  4  8 12 16 20

, , T2

   C1 C2 C3 C4 C5
R1 21 25 29 33 37
R2 22 26 30 34 38
R3 23 27 31 35 39
R4 24 28 32 36 40

, , T3

   C1 C2 C3 C4 C5
R1 41 45 49 53 57
R2 42 46 50 54 58
R3 43 47 51 55 59
R4 44 48 52 56 60

> a[1:3,c(1,3),]  # rows 1 to 3 and columns 1 and 3 of every table
, , T1

   C1 C3
R1  1  9
R2  2 10
R3  3 11

, , T2

   C1 C3
R1 21 29
R2 22 30
R3 23 31

, , T3

   C1 C3
R1 41 49
R2 42 50
R3 43 51

  

10.5.3 Searching and highlighting in both title and content

Section 10.4 only searched for keywords in the content. In practice the search usually needs to cover both the title and the content.

A BooleanQuery is introduced here, and a titleQuery is added to the search conditions.

The following code in the search method changes:

            PhraseQuery query = new PhraseQuery(); // content query
            PhraseQuery titleQuery = new PhraseQuery(); // title query
            List<string> lstkw = LuceneHelper.PanGuSplitWord(kw); // split the user's input into words

            foreach (string word in lstkw)
            {
                query.Add(new Term("Content", word)); // contains("Content", word)
                titleQuery.Add(new Term("Title", word));
            }
            query.SetSlop(100); // words more than 100 positions apart (a rule-of-thumb value) are not matched, because relevance drops with distance

            BooleanQuery bq = new BooleanQuery();
            // Occur.SHOULD means OR; Occur.MUST means AND
            bq.Add(query, BooleanClause.Occur.SHOULD);
            bq.Add(titleQuery, BooleanClause.Occur.SHOULD);

            TopScoreDocCollector collector = TopScoreDocCollector.create(1000, true); // container for the search results
            searcher.Search(bq, null, collector); // run the combined query bq; the results go into collector

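1.4 Data frames

10.5.4 AND queries, OR queries, and paging

So far every search has been an AND query: entering "诸葛亮周瑜" (Zhuge Liang, Zhou Yu) only finds records that contain both 诸葛亮 and 周瑜. Sometimes we want the records that contain either 诸葛亮 or 周瑜, which is the so-called OR query.

I add a checkbox labelled "或查询" (OR query) to the page so that the user decides which kind of query to run.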
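
The difference between the two modes can be illustrated without Lucene. The sketch below is an assumption-laden simplification: KeywordMatch is a hypothetical helper that uses plain substring tests, so it ignores word segmentation, slop, and relevance scoring, all of which the real queries handle.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper that mimics AND / OR matching with substring tests.
public static class KeywordMatch
{
    // AND: the text must contain every word.
    public static bool MatchesAll(string text, IEnumerable<string> words)
    {
        return words.All(w => text.Contains(w));
    }

    // OR: the text must contain at least one word.
    public static bool MatchesAny(string text, IEnumerable<string> words)
    {
        return words.Any(w => text.Contains(w));
    }
}

public static class Program
{
    public static void Main()
    {
        var docs = new List<string>
        {
            "Zhuge Liang borrowed the arrows",
            "Zhou Yu met Zhuge Liang",
            "Cao Cao marched south"
        };
        var words = new[] { "Zhuge Liang", "Zhou Yu" };

        int andHits = docs.Count(d => KeywordMatch.MatchesAll(d, words));
        int orHits = docs.Count(d => KeywordMatch.MatchesAny(d, words));

        Console.WriteLine("AND hits: " + andHits);
        Console.WriteLine("OR hits: " + orHits);
    }
}
```

The OR query always returns at least as many records as the AND query, so it is the broader of the two options offered by the checkbox.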

Paging uses MvcPager; see section 4.6.3 for how to use it.
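
Independently of MvcPager, the arithmetic behind a page of results is a start offset plus a page length. The following is a minimal sketch with a hypothetical Paging helper over an in-memory list; it is not MvcPager's or Lucene's API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper showing the arithmetic behind 1-based paging.
public static class Paging
{
    // Zero-based index of the first item on the given page.
    public static int StartIndex(int pageNo, int pageLen)
    {
        return (pageNo - 1) * pageLen;
    }

    // Number of pages needed for 'total' items (rounds up).
    public static int PageCount(int total, int pageLen)
    {
        return (total + pageLen - 1) / pageLen;
    }

    // Skips the earlier pages, then takes at most one page of items.
    public static List<T> Page<T>(IEnumerable<T> items, int pageNo, int pageLen)
    {
        return items.Skip(StartIndex(pageNo, pageLen)).Take(pageLen).ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        List<int> hits = Enumerable.Range(1, 10).ToList(); // ten fake hit ids
        List<int> page3 = Paging.Page(hits, 3, 4);         // third page, four per page

        Console.WriteLine(string.Join(",", page3));
        Console.WriteLine(Paging.PageCount(hits.Count, 4));
    }
}
```

The second argument to Take is a count of items, not an end index, which is the detail that is easiest to get wrong when slicing results by page.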

The complete view code:


@{
    ViewBag.Title = "Index";
}
@model PagedList<SearchDemo.Models.SearchResult>
@using Webdiyer.WebControls.Mvc;
@using SearchDemo.Models;
<style type="text/css">
.search-text2{ display:block; width:528px; height:26px; line-height:26px; float:left; margin:3px 5px; border:1px solid gray; outline:none; font-family:'Microsoft Yahei'; font-size:14px;}
.search-btn2{width:102px; height:32px; line-height:32px; cursor:pointer; border:0px; background-color:#d6000f;font-family:'Microsoft Yahei'; font-size:16px;color:#f3f3f3;}
.search-list-con{width:640px; background-color:#fff; overflow:hidden; margin-top:0px; padding-bottom:15px; padding-top:5px;}
.search-list{width:600px; overflow:hidden; margin:15px 20px 0px 20px;}
.search-list dt{font-family:'Microsoft Yahei'; font-size:16px; line-height:20px; margin-bottom:7px; font-weight:normal;}
.search-list dt a{color:#2981a9;}
.search-list dt a em{ font-style:normal; color:#cc0000;}
#divKeyWords {text-align:left;width:520px;padding-left:4px;}
#divKeyWords a {text-decoration:none;}
#divKeyWords a:hover {color:red;}
</style>
<link href="~/lib/Autocomplete/css/ui-lightness/jquery-ui-1.8.17.custom.css" rel="stylesheet" />
@using(@Html.BeginForm(null, null, FormMethod.Get))
{
    @Html.Hidden("hidfIsOr")
    <div>@Html.TextBox("txtSearch", null, new { @class="search-text2"})<input type="submit" value="搜索" name="btnSearch" id="btnSearch"  class="search-btn2"/><input type="checkbox" id="isOr" value="false"/>或查询</div>
    <div id="divKeyWords">热门搜索:@if (ViewBag.KeyWords != null) {
             foreach (string v in ViewBag.KeyWords) { 
              <a href="#">@v</a>
             }
         }</div>
    <div class="search-list-con">
        <dl class="search-list">
            @if (Model != null&& Model.Count > 0)
            {
                foreach (var viewModel in Model)
                {
                <dt><a href="@viewModel.Url" target="_blank">@MvcHtmlString.Create(viewModel.Title)</a>@viewModel.CreateTime</dt>
                <dd>@MvcHtmlString.Create(viewModel.Msg)</dd>
                }
            } 
              @Html.Pager(Model, new PagerOptions
 {
     PageIndexParameterName = "id",
     ShowPageIndexBox = true,
     FirstPageText = "首页",
     PrevPageText = "上一页",
     NextPageText = "下一页",
     LastPageText = "末页",
     PageIndexBoxType = PageIndexBoxType.TextBox,
     PageIndexBoxWrapperFormatString = "请输入页数{0}",
     GoButtonText = "转到"
 })
     <br />
     >>分页 共有 @(Model==null? 0: Model.TotalItemCount) 篇文章 @(Model==null?0:Model.CurrentPageIndex)/@(Model==null?0:Model.TotalPageCount)
        </dl>
    </div>
    <div>@ViewData["ShowInfo"]</div>
}
<script type="text/javascript">
    $(function () {
        $("#divKeyWords a").click(function () {
            $("#txtSearch").val($(this).html());
            $("#btnSearch").click();
        });
        getKeyWordsList("txtSearch");
        $("#isOr").click(function () {
            if ($(this).prop("checked")) {
                $("#hidfIsOr").val(true);
            }
            else {
                $("#hidfIsOr").val(false);
            }
        });
        if ($("#hidfIsOr").val() == "true") {
            $("input[type='checkbox']").prop("checked", true);
        }
    });
    // Attach the autocomplete suggestion list to the search box
    function getKeyWordsList(txt) {
        if (txt == undefined || txt == "")
            return;
        $("#" + txt).autocomplete({
            source: "/LastSearch/GetKeyWordsList",
            minLength: 1
        });
    }
</script>
<script src="~/lib/Autocomplete/js/jquery-ui-1.8.17.custom.min.js"></script>


Next, look at the methods in the LastSearch controller:


 public class LastSearchController : Controller
    {
        //
        // GET: /LastSearch/

        string indexPath = System.Configuration.ConfigurationManager.AppSettings["lucenedir"];
        private SearchDemoContext db = new SearchDemoContext();

        public ActionResult Index(string txtSearch, bool? hidfIsOr, int id = 1)
        {
            PagedList<SearchResult> list = null;
            if (!string.IsNullOrEmpty(txtSearch)) // a search was submitted
            {
                //list = Search(txtSearch);
                // hidfIsOr is true only when the "或查询" (OR query) checkbox is ticked; otherwise run the AND query.
                list = (hidfIsOr == null || hidfIsOr.Value == false) ? AndSearch(txtSearch, id) : OrSearch(txtSearch, id);
            }
            var keyWords = db.SearchTotal.OrderByDescending(a => a.SearchCounts).Select(x => x.KeyWords).Skip(0).Take(6).ToList();
            ViewBag.KeyWords = keyWords;
            return View(list);
        }
        // AND query
        PagedList<SearchResult> AndSearch(String kw, int pageNo, int pageLen = 4)
        {
            FSDirectory directory = FSDirectory.Open(new DirectoryInfo(indexPath), new NoLockFactory());
            IndexReader reader = IndexReader.Open(directory, true);
            IndexSearcher searcher = new IndexSearcher(reader);
            PhraseQuery query = new PhraseQuery(); // content query
            PhraseQuery titleQuery = new PhraseQuery(); // title query
            List<string> lstkw = LuceneHelper.PanGuSplitWord(kw); // split the user's input into words

            foreach (string word in lstkw)
            {
                query.Add(new Term("Content", word)); // contains("Content", word)
                titleQuery.Add(new Term("Title", word));
            }
            query.SetSlop(100); // words more than 100 positions apart (a rule-of-thumb value) are not matched, because relevance drops with distance

            BooleanQuery bq = new BooleanQuery();
            // Occur.SHOULD means OR; Occur.MUST means AND
            bq.Add(query, BooleanClause.Occur.SHOULD);
            bq.Add(titleQuery, BooleanClause.Occur.SHOULD);

            TopScoreDocCollector collector = TopScoreDocCollector.create(1000, true); // container for the search results
            searcher.Search(bq, null, collector); // run the combined query bq; the results go into collector

            int recCount = collector.GetTotalHits(); // total number of hits
            // One page of hits: the first argument is the start offset, the second is the number of hits to take.
            ScoreDoc[] docs = collector.TopDocs((pageNo - 1) * pageLen, pageLen).scoreDocs;

            List<SearchResult> list = new List<SearchResult>();
            string msg = string.Empty;
            string title = string.Empty;

            for (int i = 0; i < docs.Length; i++) // iterate over the hits
            {
                int docId = docs[i].doc; // only the document id is returned, because a Document can be memory-hungry (compare DataSet and DataReader)
                // so the content needs a second lookup by id
                Document doc = searcher.Doc(docId); // load the stored Document by id
                SearchResult result = new SearchResult();
                result.Id = Convert.ToInt32(doc.Get("Id"));
                msg = doc.Get("Content"); // only fields stored with Field.Store.YES can be read with Get
                result.Msg = LuceneHelper.CreateHightLight(kw, msg); // highlight the keywords in the content
                title = doc.Get("Title");
                foreach (string word in lstkw)
                {
                    // The wrapping tags were lost from the original listing; <em> is assumed because the view's CSS styles "dt a em".
                    title = title.Replace(word, "<em>" + word + "</em>");
                }
                //result.Title=LuceneHelper.CreateHightLight(kw, title);
                result.Title = title;
                result.CreateTime = Convert.ToDateTime(doc.Get("CreateTime"));
                result.Url = "/Article/Details?Id=" + result.Id + "&kw=" + kw;
                list.Add(result);
            }
            // Record the searched keyword in the detail table.
            SearchDetail _SearchDetail = new SearchDetail { Id = Guid.NewGuid(), KeyWords = kw, SearchDateTime = DateTime.Now };
            db.SearchDetail.Add(_SearchDetail);
            int r = db.SaveChanges();

            PagedList<SearchResult> lst = new PagedList<SearchResult>(list, pageNo, pageLen, recCount);
            lst.TotalItemCount = recCount;
            lst.CurrentPageIndex = pageNo;

            return lst;
        }
        // OR query
        PagedList<SearchResult> OrSearch(String kw, int pageNo, int pageLen = 4)
        {
            FSDirectory directory = FSDirectory.Open(new DirectoryInfo(indexPath), new NoLockFactory());
            IndexReader reader = IndexReader.Open(directory, true);
            IndexSearcher searcher = new IndexSearcher(reader);
            List<PhraseQuery> lstQuery = new List<PhraseQuery>();
            List<string> lstkw = LuceneHelper.PanGuSplitWord(kw); // split the user's input into words

            foreach (string word in lstkw)
            {
                PhraseQuery query = new PhraseQuery(); // content query for this word
                query.SetSlop(100); // words more than 100 positions apart (a rule-of-thumb value) are not matched, because relevance drops with distance
                query.Add(new Term("Content", word)); // contains("Content", word)

                PhraseQuery titleQuery = new PhraseQuery(); // title query for this word
                titleQuery.Add(new Term("Title", word));

                lstQuery.Add(query);
                lstQuery.Add(titleQuery);
            }

            BooleanQuery bq = new BooleanQuery();
            foreach (var v in lstQuery)
            {
                // Occur.SHOULD means OR; Occur.MUST means AND
                bq.Add(v, BooleanClause.Occur.SHOULD);
            }
            TopScoreDocCollector collector = TopScoreDocCollector.create(1000, true); // container for the search results
            searcher.Search(bq, null, collector); // run the combined query bq; the results go into collector

            int recCount = collector.GetTotalHits(); // total number of hits
            // One page of hits: the first argument is the start offset, the second is the number of hits to take.
            ScoreDoc[] docs = collector.TopDocs((pageNo - 1) * pageLen, pageLen).scoreDocs;

            List<SearchResult> list = new List<SearchResult>();
            string msg = string.Empty;
            string title = string.Empty;

            for (int i = 0; i < docs.Length; i++) // iterate over the hits
            {
                int docId = docs[i].doc; // only the document id is returned, because a Document can be memory-hungry (compare DataSet and DataReader)
                // so the content needs a second lookup by id
                Document doc = searcher.Doc(docId); // load the stored Document by id
                SearchResult result = new SearchResult();
                result.Id = Convert.ToInt32(doc.Get("Id"));
                msg = doc.Get("Content"); // only fields stored with Field.Store.YES can be read with Get
                result.Msg = LuceneHelper.CreateHightLight(kw, msg); // highlight the keywords in the content
                title = doc.Get("Title");
                foreach (string word in lstkw)
                {
                    // The wrapping tags were lost from the original listing; <em> is assumed because the view's CSS styles "dt a em".
                    title = title.Replace(word, "<em>" + word + "</em>");
                }
                //result.Title=LuceneHelper.CreateHightLight(kw, title);
                result.Title = title;
                result.CreateTime = Convert.ToDateTime(doc.Get("CreateTime"));
                result.Url = "/Article/Details?Id=" + result.Id + "&kw=" + kw;
                list.Add(result);
            }
            // Record the searched keyword in the detail table.
            SearchDetail _SearchDetail = new SearchDetail { Id = Guid.NewGuid(), KeyWords = kw, SearchDateTime = DateTime.Now };
            db.SearchDetail.Add(_SearchDetail);
            int r = db.SaveChanges();

            PagedList<SearchResult> lst = new PagedList<SearchResult>(list, pageNo, pageLen, recCount);
            lst.TotalItemCount = recCount;
            lst.CurrentPageIndex = pageNo;

            return lst;
        }

        /// <summary>
        /// Returns the keywords that match the typed prefix, for the autocomplete list.
        /// </summary>
        /// <param name="term"></param>
        /// <returns></returns>
        public string GetKeyWordsList(string term)
        {
            if (string.IsNullOrWhiteSpace(term))
                return null;

            var list = new KeyWordsTotalService().GetSearchMsg(term);
            // Serialize the result.
            // Prefer Newtonsoft.Json over JavaScriptSerializer, which performs poorly.
            //System.Web.Script.Serialization.JavaScriptSerializer js = new System.Web.Script.Serialization.JavaScriptSerializer();
            //return js.Serialize(list.ToArray());
            return JsonConvert.SerializeObject(list.ToArray());
        }
    }


At this point the basic features of the site search are complete.

Creating

data.frame(field1 = vector1, field2 = vector2, ...)
names(data_frame_name)

  

> ForeDataFrm<-data.frame(FX=X,FY=Y,Fmonth=month,Fday=day,Ftemp=temp,FRH=RH,Fwind=wind,Frain=rain,Farea=area)
> ForeDataFrm
  FX FY Fmonth Fday Ftemp FRH Fwind Frain Farea
1  1  2    aug  fri  14.7  66   2.7     0     0
2  1  2    aug  fri  18.5  73   8.5     0     0
3  1  2    aug  fri  25.9  41   3.6     0     0
> names(ForeDataFrm)
[1] "FX"     "FY"     "Fmonth" "Fday"   "Ftemp"  "FRH"    "Fwind" 
[8] "Frain"  "Farea" 

> str(ForeDataFrm)
'data.frame':   3 obs. of  9 variables:
 $ FX    : num  1 1 1
 $ FY    : num  2 2 2
 $ Fmonth: Factor w/ 1 level "aug": 1 1 1
 $ Fday  : Factor w/ 1 level "fri": 1 1 1
 $ Ftemp : num  14.7 18.5 25.9
 $ FRH   : num  66 73 41
 $ Fwind : num  2.7 8.5 3.6
 $ Frain : num  0 0 0
 $ Farea : num  0 0 0

  If no data is available yet when the data frame is created:

> a<-data.frame(x1=numeric(0),x2=character(0),x3=logical(0))
> str(a)
'data.frame':   0 obs. of  3 variables:
 $ x1: num 
 $ x2: Factor w/ 0 levels: 
 $ x3: logi 
> fix(a)

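  Accessing

1. data_frame$field
2. data_frame[["field"]]
3. data_frame[[field_index]]
4.
attach(data_frame)
expression using field names 1
expression using field names 2
...
detach(data_frame)

with(data_frame, {expression 1, ...})  # assignments made inside with() are not saved back to the data frame

data_frame <- within(data_frame, {expression 1, ...})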

  

> ForeDataFrm
  FX FY Fmonth Fday Ftemp FRH Fwind Frain Farea
1  1  2    aug  fri  14.7  66   2.7     0     0
2  1  2    aug  fri  18.5  73   8.5     0     0
3  1  2    aug  fri  25.9  41   3.6     0     0
> ForeDataFrm$Fwind
[1] 2.7 8.5 3.6
> ForeDataFrm[["Ftemp"]]
[1] 14.7 18.5 25.9
> ForeDataFrm[[5]]
[1] 14.7 18.5 25.9
> ForeDataFrm$Ftemp<-ForeDataFrm$Ftemp*1.8+32  # operate on one named field
> attach(ForeDataFrm)
> Ftemp
[1] 58.46 65.30 78.62
> Fwind
[1] 2.7 8.5 3.6
> detach(ForeDataFrm)
> Ftemp  # after detach() the field can no longer be referenced by its bare name
Error: object 'Ftemp' not found

 

> with(ForeDataFrm,{
+ print(Ftemp)
+ Ftemp<-(Ftemp-32)/1.8
+ print(Ftemp)
+ print(Fwind)
+ })
[1] 58.46 65.30 78.62
[1] 14.7 18.5 25.9
[1] 2.7 8.5 3.6

  

 

1.5 Lists

 

list(component1 = object1, ...)

  Create a list named d whose components L1, L2, L3 correspond to the objects a, b, c in that order

a<-c(1,2,3)
b<-matrix(nrow=5,ncol=2)
b[,1]=seq(from=1,to=10,by=2)
b[,2]=seq(from=10,to=1,by=-2)
c<-array(1:60,c(4,5,3))
d<-list(L1=a,L2=b,L3=c) 
names(d)  
str(d)
is.list(d) 
d$L1 
d[["L2"]]
d[[2]]

> str(d)
List of 3
$ L1: num [1:3] 1 2 3
$ L2: num [1:5, 1:2] 1 3 5 7 9 10 8 6 4 2
$ L3: int [1:4, 1:5, 1:3] 1 2 3 4 5 6 7 8 9 10 …

> d$L1 
[1] 1 2 3
> d[["L2"]]
     [,1] [,2]
[1,]    1   10
[2,]    3    8
[3,]    5    6
[4,]    7    4
[5,]    9    2
> d[[2]]
     [,1] [,2]
[1,]    1   10
[2,]    3    8
[3,]    5    6
[4,]    7    4
[5,]    9    2

  1.6 Converting between data objects

1. Different storage types

typeof()
as.<storage type>(object_name)
#numeric, integer, double, character, logical

2. Converting between different structure types

> (a<-c(1:10))
 [1]  1  2  3  4  5  6  7  8  9 10
> (b<-matrix(a,nrow=5,ncol=2,byrow=TRUE))
     [,1] [,2]
[1,]    1    2
[2,]    3    4
[3,]    5    6
[4,]    7    8
[5,]    9   10
> (a<-as.matrix(a))
      [,1]
 [1,]    1
 [2,]    2
 [3,]    3
 [4,]    4
 [5,]    5
 [6,]    6
 [7,]    7
 [8,]    8
 [9,]    9
[10,]   10
> is.matrix(a)
[1] TRUE
> (b<-as.vector(b))
 [1]  1  3  5  7  9  2  4  6  8 10
> is.vector(b)
[1] TRUE

  Note that a matrix is converted to a vector column by column, taking the columns from left to right

as.matrix()
as.vector()

  

3. Vector to factor

as.factor()
factor(vector_name, ordered=TRUE/FALSE, levels=c(list of category values))

  

> (a<-c("Poor","Improved","Excellent","Poor")) 
[1] "Poor"      "Improved"  "Excellent" "Poor"     
> (b<-factor(a,order=FALSE,levels=c("Poor","Improved","Excellent")))  
[1] Poor      Improved  Excellent Poor     
Levels: Poor Improved Excellent
> (b<-factor(a,order=TRUE,levels=c("Poor","Improved","Excellent")))
[1] Poor      Improved  Excellent Poor     
Levels: Poor < Improved < Excellent

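  Factor to vector: R does not allow a new level to be added to a factor by direct assignment, so first convert the factor to a vector, add the element with the new category, then convert the vector back to a factor

as.vector(factor_name)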

  

####### reset the category values with factor()
> (a<-c("Poor","Improved","Excellent","Poor")) 
[1] "Poor"      "Improved"  "Excellent" "Poor"     
> (b<-factor(a,levels=c("Poor","Improved","Excellent")))  
[1] Poor      Improved  Excellent Poor     
Levels: Poor Improved Excellent
> (b<-factor(a,levels=c("Poor","Improved","Excellent"),labels=c("C","B","A")))
[1] C B A C
Levels: C B A

############### add a factor level by way of type conversion
> (a<-c("A","C","B","C")) 
[1] "A" "C" "B" "C"
> (b<-as.factor(a))
[1] A C B C
Levels: A B C
> b[5]<-"D"
Warning message:
In `[<-.factor`(`*tmp*`, 5, value = "D") :
  invalid factor level, NA generated

> c<-as.vector(b)
> typeof(c)
[1] "character"
> c[5]<-"D"
> (b<-as.factor(c))
[1] A C B C D
Levels: A B C D

  

2. Importing data

1. Reading text data

# into a vector
> Forest<-scan(file="ForestData.txt",what=double(),skip=1)   # fails: scan() requires all values to have the same type
Error in scan(file = "ForestData.txt", what = double(), skip = 1) : 
  scan() expected 'a real', got 'aug'

# into a data frame
Forest<-read.table(file="ForestData.txt",header=TRUE)
str(Forest)
names(Forest)

Forest<-read.table(file="ForestData.txt",header=TRUE,stringsAsFactors=FALSE)
# keep strings as character vectors instead of converting them to factors (whose levels are sorted alphabetically)

Forest<-read.table(file="ForestData.txt",header=TRUE,
 colClasses=c("integer","integer","character","character","double","integer","double","double","double"))
# specify the storage type of each column

  

2. Importing table data

######################################## read SPSS data
library(foreign)
Forest<-read.spss(file="ForestData.sav",use.value.labels = TRUE, to.data.frame = TRUE)
str(Forest)


######################## read Excel data
install.packages("xlsx")
library("xlsx")
Forest<-read.xlsx("ForestData.xlsx",1,header=TRUE,as.data.frame=TRUE)
str(Forest)
levels(Forest$month)
Forest$month<-factor(Forest$month,order=TRUE,levels=c("jan","feb","mar","apr","may","jun","jul","aug","sep","oct","nov","dec"))
levels(Forest$month)

####################### read data from a database
install.packages("RODBC")
library("RODBC")
MyConn<-odbcConnectAccess2007("ForestData.accdb",uid="",pwd="")
Forest<-sqlFetch(MyConn,"Table1")
close(MyConn)
str(Forest)

  

3. Merging and sorting data

3.1. Merging data

## Default S3 method:
merge(x, y, ...)

## S3 method for class 'data.frame'
merge(x, y, by = intersect(names(x), names(y)),
      by.x = by, by.y = by, all = FALSE, all.x = all, all.y = all,
      sort = TRUE, suffixes = c(".x",".y"),
      incomparables = NULL, ...)


authors <- data.frame(
    surname = I(c("Tukey", "Venables", "Tierney", "Ripley", "McNeil")),
    nationality = c("US", "Australia", "US", "UK", "Australia"),
    deceased = c("yes", rep("no", 4)))
books <- data.frame(
    name = I(c("Tukey", "Venables", "Tierney",
             "Ripley", "Ripley", "McNeil", "R Core")),
    title = c("Exploratory Data Analysis",
              "Modern Applied Statistics ...",
              "LISP-STAT",
              "Spatial Statistics", "Stochastic Simulation",
              "Interactive Data Analysis",
              "An Introduction to R"),
    other.author = c(NA, "Ripley", NA, NA, NA, NA,
                     "Venables & Smith"))

(m1 <- merge(authors, books, by.x = "surname", by.y = "name"))
(m2 <- merge(books, authors, by.x = "name", by.y = "surname"))

> (m1 <- merge(authors, books, by.x = "surname", by.y = "name"))
   surname nationality deceased                         title
1   McNeil   Australia       no     Interactive Data Analysis
2   Ripley          UK       no            Spatial Statistics
3   Ripley          UK       no         Stochastic Simulation
4  Tierney          US       no                     LISP-STAT
5    Tukey          US      yes     Exploratory Data Analysis
6 Venables   Australia       no Modern Applied Statistics ...
  other.author
1         <NA>
2         <NA>
3         <NA>
4         <NA>
5         <NA>
6       Ripley
> (m2 <- merge(books, authors, by.x = "name", by.y = "surname"))
      name                         title other.author nationality
1   McNeil     Interactive Data Analysis         <NA>   Australia
2   Ripley            Spatial Statistics         <NA>          UK
3   Ripley         Stochastic Simulation         <NA>          UK
4  Tierney                     LISP-STAT         <NA>          US
5    Tukey     Exploratory Data Analysis         <NA>          US
6 Venables Modern Applied Statistics ...       Ripley   Australia
  deceased
1       no
2       no
3       no
4       no
5      yes
6       no
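
The `all`, `all.x` and `all.y` arguments in the signature above control what happens to rows without a match. The sketch below uses a cut-down version of the authors and books data (a reduced, illustrative subset rather than the full tables) to compare the default inner join with a left join and a full outer join.

```r
authors <- data.frame(
  surname     = c("Tukey", "Ripley", "McNeil"),
  nationality = c("US", "UK", "Australia"),
  stringsAsFactors = FALSE)
books <- data.frame(
  name  = c("Tukey", "Ripley", "Ripley", "R Core"),
  title = c("Exploratory Data Analysis", "Spatial Statistics",
            "Stochastic Simulation", "An Introduction to R"),
  stringsAsFactors = FALSE)

# Default (all = FALSE): keep only keys present in both tables
inner <- merge(authors, books, by.x = "surname", by.y = "name")

# all.x = TRUE: keep every author, filling title with NA when there is no book
left <- merge(authors, books, by.x = "surname", by.y = "name", all.x = TRUE)

# all = TRUE: keep unmatched rows from both sides
full <- merge(authors, books, by.x = "surname", by.y = "name", all = TRUE)

print(full)
```

In the full join the key column keeps the name given by `by.x`, so the unmatched book author "R Core" appears under `surname` with NA for `nationality`.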

  

3.2. Sorting data

order(..., na.last = TRUE, decreasing = FALSE,
      method = c("auto", "shell", "radix"))

> ReportCard<-read.table(file="ReportCard1.txt",header=TRUE)
> Ord<-order(ReportCard$math,na.last=TRUE,decreasing=TRUE) # sort by math from highest to lowest
> Ord # position vector (row indices in sorted order)
 [1] 48 60 59 15 27 23 36 30 49 42  6 28  7 41 58 32 54 45 39 44
[21] 52 12 40 38 10  4 29 26 56 33 43 37 31 16  3 11  9 55 50 13
[41] 47 51 53 22 24 57 18  8 19 21 34 46  2 14 20 35  5 25 17  1
> a<-ReportCard[Ord,]
> fix(a)
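
Since ReportCard1.txt is not available here, the behaviour of `order()` with ties and missing values can be checked on a small made-up score table (the values below are invented for illustration). The sketch also shows a second sort key used to break ties.

```r
scores <- data.frame(xh   = c(101, 102, 103, 104),
                     math = c(85, NA, 92, 85))

# Highest math first; na.last = TRUE pushes the missing score to the end
ord <- order(scores$math, na.last = TRUE, decreasing = TRUE)

# Two keys: descending math (via negation), then ascending xh to break ties
ord2 <- order(-scores$math, scores$xh)

# The position vector is then used as a row index
sorted <- scores[ord2, ]
print(sorted)
```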


4. Missing data

is.na()
is.nan()
complete.cases(matrix/dataframe) # checks, for each observation (row), whether it has no missing values

 

> a<-ReportCard[Ord,]
> a<-is.na(ReportCard$math)
> ReportCard[a,]
     xh sex poli chi math
1 92103   2   NA  NA   NA
> a<-complete.cases(ReportCard) # flag the complete observations
> ReportCard[!a,] # show the students who have missing scores
     xh sex poli chi math
1 92103   2   NA  NA   NA
3 92142   2  NaN  70   59

  Generating a missing-value report

md.pattern()

  

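> library("mice")
> md.pattern(ReportCard)
   xh sex chi math poli
58  1   1   1    1    1 0  # 58 students have complete score data (0 variables with missing values)
 1  1   1   1    1    0 1  # 1 student has a missing value on the single variable poli
 1  1   1   0    0    0 3  # 1 student has missing values on three variables
    0   0   1    1    2 4  # number of observations with a missing value in each column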
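
The same kind of counts can be obtained with base R alone when the mice package is not installed. The sketch below is an illustration on invented data; it does not reproduce the full pattern matrix of `md.pattern()`, only the per-column and per-row totals.

```r
rc <- data.frame(xh   = 1:5,
                 chi  = c(70, NA, 80, 90, 85),
                 math = c(60, NA, 59, 77, 88),
                 poli = c(NA, NA, 65, NaN, 90))

# Number of missing values in each column (bottom row of the report)
per_column <- colSums(is.na(rc))

# Number of missing variables for each observation (last column of the report)
per_row <- rowSums(is.na(rc))

# How many observations have 0, 1, 2, ... missing variables
pattern <- table(per_row)
print(per_column)
print(pattern)
```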

 

  

 Handling missing values

# Crude approach: delete the incomplete observations
na.omit()

  

> ReportCard1<-read.table(file="ReportCard1.txt",header=TRUE)
> ReportCard2<-read.table(file="ReportCard2.txt",header=TRUE)
> ReportCard<-merge(ReportCard1,ReportCard2,by="xh")
> attach(ReportCard)
> SumScore<-poli+chi+math+fore+phy+che+geo+his
> detach(ReportCard)
> avScore<-SumScore/8
> ReportCard$sumScore<-SumScore
> ReportCard$avScore<-avScore
> sum(is.na(ReportCard$sumScore))
[1] 2
> mean(complete.cases(ReportCard))
[1] 0.9666667
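
The two missing totals above arise because arithmetic with NA yields NA. The sketch below, on invented scores, contrasts that behaviour with `na.omit()` (drop incomplete rows) and with the `na.rm = TRUE` option of `rowSums()` (ignore missing values inside the sum). Which of these is appropriate depends on the analysis; this is only an illustration of the mechanics.

```r
rc <- data.frame(xh   = 1:4,
                 poli = c(80, NA, 70, 90),
                 math = c(60, 75, NA, 85))

# Plain addition propagates NA
sum_strict <- rc$poli + rc$math

# na.omit() removes every row that contains at least one NA
clean <- na.omit(rc)

# na.rm = TRUE sums only the values that are present
sum_lenient <- rowSums(rc[, c("poli", "math")], na.rm = TRUE)

print(clean)
```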

  

5. Computing and recoding variables

5.1 Computing variables

Operators: + - * / ^   integer division: %/%   remainder: %%
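
A quick check of the less familiar operators, including how integer division and remainder behave for a negative operand:

```r
power     <- 2 ^ 3        # exponentiation
quotient  <- 7 %/% 2      # integer division rounds toward minus infinity
remainder <- 7 %% 2       # remainder of the division

# For negative numbers the remainder takes the sign of the divisor
neg_quotient  <- (-7) %/% 2
neg_remainder <- (-7) %% 2

# The operators are vectorized, like the other arithmetic operators
all_rem <- c(10, 11, 12) %% 3
```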

Functions

[Images 13 to 17]

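5.2 User-defined functions

# Merge two data frames on the column(s) named by key
MyFun<-function(dataname1,dataname2,key){
 result<-merge(dataname1,dataname2,by=key)
 return(result)
}
##### Call the user-defined function
MyData<-MyFun(dataname1=ReportCard1,dataname2=ReportCard2,key="xh")  # arguments matched by name
MyData<-MyFun(ReportCard1,ReportCard2,"xh")                          # arguments matched by position
# Debug the function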

> debug(MyFun)
> MyData<-MyFun(dataname1=ReportCard1,dataname2=ReportCard2,key="xh")
debugging in: MyFun(dataname1 = ReportCard1, dataname2 = ReportCard2, key = "xh")
debug at #1: {
    result <- merge(dataname1, dataname2, by = key)
    return(result)
}
Browse[2]> n
debug at #2: result <- merge(dataname1, dataname2, by = key)
Browse[2]> n
debug at #3: return(result)
Browse[2]> n
exiting from: MyFun(dataname1 = ReportCard1, dataname2 = ReportCard2, key = "xh")
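
A user-defined function can also declare a default argument value and validate its inputs before doing any work. The sketch below is a hypothetical variant of the merge helper (the name `merge_by`, the default key "xh" and the sample data are assumptions made for illustration).

```r
# Hypothetical helper: merge two data frames on 'key', which defaults to "xh"
merge_by <- function(left, right, key = "xh") {
  # Fail early with a clear error if the key column is missing on either side
  stopifnot(key %in% names(left), key %in% names(right))
  merge(left, right, by = key)
}

a <- data.frame(xh = c(1, 2, 3), math = c(90, 80, 70))
b <- data.frame(xh = c(2, 3, 4), phy  = c(85, 75, 65))

r1 <- merge_by(a, b)                             # key falls back to its default
r2 <- merge_by(left = a, right = b, key = "xh")  # every argument given by name
bad <- tryCatch(merge_by(a, b, key = "id"), error = function(e) "error")
```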

  

5.3 Recoding

1. Grouping

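# Recode the numeric average score into letter grades A to E.
# Note: the first assignment coerces avScore to character, so the later
# comparisons are string comparisons; they give the intended result only
# while the remaining scores have two integer digits (10 up to 99.x).
ReportCard<-within(ReportCard,{
 avScore[avScore>=90]<-"A"
 avScore[avScore>=80 & avScore<90]<-"B"
 avScore[avScore>=70 & avScore<80]<-"C"
 avScore[avScore>=60 & avScore<70]<-"D"
 avScore[avScore<60]<-"E"
 })
# Anything that was not recoded (for example a missing average) becomes NA
flag<-ReportCard$avScore %in% c("A","B","C","D","E")
ReportCard$avScore[!flag]<-NA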
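
The same grouping can be written with `cut()`, which bins a numeric vector in one step and leaves missing values as NA. This is an alternative sketch on invented averages, not the code used above; note that it returns a factor rather than a character vector.

```r
avScore <- c(95, 82.5, 71, 60, 59.9, NA)

# right = FALSE makes each interval closed on the left: [60,70), [70,80), ...
grade <- cut(avScore,
             breaks = c(-Inf, 60, 70, 80, 90, Inf),
             labels = c("E", "D", "C", "B", "A"),
             right  = FALSE)

print(data.frame(avScore, grade))
```

Comparisons such as `grade == "E"` work on the resulting factor in the same way as on the character codes.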


2. Redefining category values

> unique(ReportCard$sex)   # list the values of sex to check that they fall in the expected range
[1] 2 1
> ReportCard$sex<-factor(ReportCard$sex,levels=c(1,2),labels=c("M","F"))
> str(ReportCard$sex)
 Factor w/ 2 levels "M","F": 2 1 2 2 1 2 2 1 1 2 ...

  

6. Filtering data

1. Filtering by condition

# Extract the data for male students (sex is M)
MaleScore1<-subset(ReportCard,ReportCard$sex=="M")
Sel1<-ReportCard$sex=="M"
MaleScore1<-ReportCard[Sel1,]

# Extract the male students (sex is M) whose average score is a fail (below 60, grade E)
MaleScore2<-subset(ReportCard,ReportCard$sex=="M" & ReportCard$avScore=="E")
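
`subset()` and logical indexing with `[` give the same rows as long as the condition contains no NA, but they differ when it does: `subset()` drops rows whose condition is NA, whereas `[` returns an all-NA row for each of them. A sketch on invented data:

```r
rc <- data.frame(xh      = 1:4,
                 sex     = factor(c("M", "F", "M", "M")),
                 avScore = c("E", "E", NA, "B"),
                 stringsAsFactors = FALSE)

cond <- rc$sex == "M" & rc$avScore == "E"   # TRUE FALSE NA FALSE

s1 <- subset(rc, sex == "M" & avScore == "E")  # NA condition is treated as FALSE
s2 <- rc[cond, ]                               # NA condition yields a row of NAs
s3 <- rc[which(cond), ]                        # which() discards the NA positions

print(s2)
```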

  

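2. Random sampling

sample(x, size, replace = FALSE, prob = NULL) # draws observations from x in the specified way; prob gives each element of x its own probability of being selected
set.seed() # fixes the random seed so that the sampling result can be reproduced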

  

set.seed(10000)
bh<-sample(1:60,size=30,replace=FALSE)
MySample<-ReportCard[bh,]
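# The row numbers of the 60 students serve as the sampling frame: sampling the row numbers at random yields a position vector, which then selects a random sample of the score records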
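
The effect of `set.seed()` and of the `prob` argument can be verified directly. In this sketch the seed value is the one used above, while the weights 0.8 and 0.2 are arbitrary values chosen for illustration.

```r
# The same seed gives the same sample
set.seed(10000)
first <- sample(1:60, size = 30, replace = FALSE)
set.seed(10000)
second <- sample(1:60, size = 30, replace = FALSE)
same <- identical(first, second)

# Unequal inclusion probabilities: "M" is drawn about four times as often as "F"
set.seed(10000)
weighted <- sample(c("M", "F"), size = 10, replace = TRUE, prob = c(0.8, 0.2))
print(table(weighted))
```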

  

7. Saving data

write.table(ReportCard,file="ReportCard.txt",sep=" ",quote=FALSE,append=FALSE,na="NA",row.names=FALSE,col.names=TRUE)
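
A round trip through a temporary file shows that a table written with these settings can be read back with `read.table()`. The data frame below is invented for illustration.

```r
rc <- data.frame(xh   = c(92103, 92239),
                 sex  = c("F", "M"),
                 math = c(NA, 63),
                 stringsAsFactors = FALSE)

out <- tempfile(fileext = ".txt")

# Same options as above: space-separated, unquoted, header row, no row names
write.table(rc, file = out, sep = " ", quote = FALSE, append = FALSE,
            na = "NA", row.names = FALSE, col.names = TRUE)

# Missing values written as "NA" are recognized again on input
back <- read.table(out, header = TRUE, stringsAsFactors = FALSE)
print(back)
```

With `quote=FALSE` and a space separator, a character value that itself contains a space would be split into two fields on input, so this combination suits data without embedded spaces.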

  

8. Control flow

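# Expand a frequency table into raw data: each row is repeated freq times.
# Assumes the last column of mytable is named freq and holds the counts.
MyTable<-function(mytable){
 rows<-dim(mytable)[1]
 cols<-dim(mytable)[2]
 DataTable<-NULL
 for(i in 1:rows){
  for(j in seq_len(mytable$freq[i])){   # seq_len() skips rows whose freq is 0
   RowData<-mytable[i,c(1:(cols-1))]    # drop the freq column
   DataTable<-rbind(DataTable,RowData)
  }
 }
 row.names(DataTable)<-c(1:dim(DataTable)[1])
 return(DataTable)
}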

Grade<-rep(c("B","C","D","E"),times=2)
Sex<-rep(c("M","F"),each=4)
Freq<-c(2,11,12,5,2,13,10,3)
Table<-data.frame(sex=Sex,grade=Grade,freq=Freq)
MyData<-MyTable(Table)

> Table
  sex grade freq
1   M     B    2
2   M     C   11
3   M     D   12
4   M     E    5
5   F     B    2
6   F     C   13
7   F     D   10
8   F     E    3
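
The nested loops illustrate control flow, but the same expansion can be done without a loop by repeating row indices with `rep()`. The sketch below is an alternative for comparison, using the frequency table shown above.

```r
Table <- data.frame(sex   = rep(c("M", "F"), each = 4),
                    grade = rep(c("B", "C", "D", "E"), times = 2),
                    freq  = c(2, 11, 12, 5, 2, 13, 10, 3))

# Repeat each row number as many times as its frequency
idx <- rep(seq_len(nrow(Table)), times = Table$freq)

# Index the table with the repeated row numbers and drop the freq column
Expanded <- Table[idx, c("sex", "grade")]
row.names(Expanded) <- NULL

print(table(Expanded$sex, Expanded$grade))
```

The cross-tabulation at the end should reproduce the original frequencies, which is a convenient check on the expansion.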