新南威尔士大学COMP2521Assignment1课业解析
新南威尔士大学COMP2521Assignment1课业解析
题意:
实现一个C语言的抽象数据类型textbuffer的各种操作
解析:
包含下列操作:TB newTB (char *text);开辟新的空间用给定的text内容初始化;void releaseTB (TB tb);释放内存,之后不可访问;char *dumpTB (TB tb, bool showLineNumbers);按格式按行显示textbuffer存贮的内容,如果showLineNumbers是True,那么显示每行的行号,false则不显示,如下:
text原内容:
hello world
amazing
dumpTB(tb,True)结果: 1.hello world\n2.amazing\n;dumpTB(tb,Flase)结果: hello world\namazing\n int linesTB (TB tb);返回行数;void addPrefixTB (TB tb, int from, int to, char *prefix);给指定行数添加前缀;如addPrefixTB (tb, 1, 3, "goodnight ")给tb的1至3行添加前缀 goodnight;类似地void deleteTB (TB tb, int from, int to);删除from至to行的内容;void mergeTB (TB tb1, int pos, TB tb2);在tb1的pos行插入tb2的内容,之后释放tb2;void pasteTB (TB tb1, int pos, TB tb2);和merge类似但是保留tb2;TB cutTB (TB tb, int from, int to);剪切from至to行的内容;Match searchTB (TB tb, char *search);在tb中搜索search的内容,返回起始字母的行号列号。
涉及知识点:
文本处理
更多可加薇 欣 讨论
薇 欣 号:lpls666
COMP2521 Assignment 1 Textbuffer Submission Specification Jump to FAQ Your task is to implement an abstract textbuffer data type that meets the given interface. You will submit the C code implementing the textbuffer ADT (textbuffer.c). This page describes the interface of the textbuffer ADT that you are to implement. For your implementation, download textbuffer.cbelow and implement the typestruct textbufferas well as all of the functions whose prototypes are given in the header filetextbuffer.h. All your code should go intextbuffer.c, which you have to submit. Changelog 1st October Proofreading Line numbers now always start at 1 ChangedcharIndexinstruct _matchNodetocolumnNumber Changedint showLineNumbersindumpTBtobool showLineNumbers Added a stub test filetestTextbuffer.cfor students to write tests Improved formatting/style oftextbuffer.candtextbuffer.h Added some clarifications to the spec 2nd October Added clarifications forsearchTBto the FAQ Added a line totestTextbuffer.cthat callslinesTB Updated the comment foraddPrefixTBintextbuffer.cto be consistent withtextbuffer.h 3rd October Added a clarification forpasteTBto the FAQ Added a clarification about bonus marks Added a clarification forformRichText- "Note that the#character must be the first character in a lineand there must be more characters on that linefor it to be treated as a special character - otherwise, it does nothing." Moved testing for memory leaks from Style Marks to Autotesting Marks 4th October Added a question to the FAQ undernewTBon the input text beingNULL. Fixed the answer to one of thenewTBquestions: "Unlesstextis the empty string, it will always have a newline at the end." 8th October Added details to the FAQ on howdiffTBwill be tested Added an example fordiffTBto the FAQ 9th October Added details to "Late Submission" section Bonus marks can now only make up for lost marks in the labs and the second assignment - not the midterm exam. Sorry! 11th October Removed confusing question aboutabortfrom the FAQ 13th October Submission instructionsare now available 14th October Added a clarification fordumpTBto the FAQ - the returned string should always be allocated such that that the user can free it. Added a clarification forsearchTBto the FAQ about returning an empty list 15th October Removed the-Oflag from the compilation line and replaced it with-g, which supports debugging. (-Ois just an optimisation flag, which is unnecessary for this assignment.) Clarified and removed vagueness from the 'Compactness' requirement fordiffTB. There is now no 'model solution'. Instead, you'll pass each test if the number of commands in your edit solution is smaller than some threshold, determined by the sizes of the optimal and brute-force solutions (see the section fordiffTBfor details). Added a link to download all dryrun files Added clarifications on undoingcutTBandpasteTBto the FAQ. 19th October Added some clarifications/examples based on questions asked on WebCMS. Marks
The assignment is worth 10 marks. The mark breakdown is as follows: ComponentMarkAutotesting of functionality8 (+2 bonus)Subjective evaluation of style2
Due to the bonus challenges, you could get up to 12 marks for the assignment. Any extra marks obtained during this assignment (in excess of the usual 10 marks) can be used to make up for lost marks in the labs and the second assignment. Automarking - 8 (+2) Marks We will run a number of tests against your textbuffer implementation. These will be much more comprehensive than the tests we run during submission. You get marks for each test you pass. We will also test your program for memory leaks (memory you have allocated and have responsibility to free but neverfree'd) and memory errors. Your program will be tested for memory leaks/errors viavalgrind. Style - 2 Marks Style marks will include comments, indentation, variable names, etc., and will also include marks for choosing an appropriate representation for your ADT and for the efficiency of the functions you implement. For example, you will lose marks if your implementation of a function has a time complexity ofO(n^2)when there is a solution with a time complexity ofO(n)orO(n * log n). Submission Deadline:9am, Saturday 26 October 2019. You need to submit one file:textbuffer.c You can submit from the assignment page on WebCMS via the give interface or by running the command below: give cs2521 assign1 textbuffer.c The submission system runs a few simple dryrun tests. All files used by the dryrun are availablehere(clickhereto download the whole lot). After the deadline, all functions will be more thoroughly tested by the automarking system. You can submit multiple times - only your last submission will count. Late Submission A late penalty of 15% per day will be applied. The latest you can submit the assignment is 9am, 31 October 2019, of course with late penalty. Files 1. 2.textbuffer.h 1. 2.textbuffer.c 1. 2.testTextbuffer.c Note:When we test your assignment, it with be compiled withgccand the following flags: gcc -Wall -Werror -std=c11 -g -lm -o testTextbuffer testTextbuffer.c textbuffer.c ADT Specification The following is a description of the components of the interface. As marks are awarded by an automated marking program, you must follow this specification precisely. Otherwise, you risk getting few or no marks! You mustNOTmodify thetextbuffer.hfile. The ADT type We represent the ADT by way of a handle of typeTB. The handle type is declared in the header file, but you will have to provide an implementation of the handle representation - i.e. ofstruct textbuffer- as part of your implementation: typedef struct textbuffer *TB; Refer to the lecture about ADTs for examples of this construction. Required properties of the implementation A textbuffer is an ordered collection of strings, where each string represents one line of a text file.Your implementation must keep the lines of a textbuffer in a linked data structure(such as a linked list or a variant of that). Each line must be represented as a (dynamically allocated) string. Adding, deleting, or moving lines requires manipulation of the linked structure. Such a data structure may, for example, be used as part of a text editor. Constructor and destructor newTB TB newTB (char *text); newTBallocates a new textbuffer and initialises its contents with the text in the given string. Each fragment of the string that ends with a newline character ('\n') indicates a separate line in the textbuffer. releaseTB void releaseTB (TB tb); releaseTBfrees the memory occupied by the given textbuffer. It is an error to access a textbuffer after freeing it. Query functions The following functions do not alter their textbuffer argument. dumpTB char *dumpTB (TB tb, bool showLineNumbers); dumpTBallocates and returns a string containing the text in the given textbuffer. The returned string should contain newline characters ('\n') to indicate the end of each line in the textbuffer. It is the caller's responsibility to free the memory occupied by the returned string. If there are no lines in the textbuffer, return an empty string (the string should still be allocated). If showLineNumbersistrue, prepend a line number (along with a dot and space) to each line of the output. For example, ifdumpTBwas called on a textbuffer containing the lines"hello world"and"amazing", and showLineNumberswastrue, it should return"1. hello world\n2. amazing\n". IfshowLineNumberswasfalse, it should instead return"hello world\namazing\n". linesTB int linesTB (TB tb); linesTBreturns the number of lines in the given textbuffer. Textbuffer editing For all editing functions, if any of the arguments indicating a line number is out of range (i.e., smaller than 1 or bigger than the number of lines in the textbuffer), the function must print a suitable error message and terminate the program with the standard functionabort(). The first line of a textbuffer is at position/index 1. addPrefixTB void addPrefixTB (TB tb, int from, int to, char *prefix); addPrefixTBadds the supplied prefix to all lines betweenfromandto(inclusive). Iftois less thanfrom, abort. For example, consider callingaddPrefixTB (tb, 1, 3, "goodnight "): + ---------------------------- + + ------------------------------------ + | room | | goodnight room | | moon | ---> | goodnight moon | | cow jumping over the moon | | goodnight cow jumping over the moon | | light | | light | + ---------------------------- + + ------------------------------------ + deleteTB void deleteTB (TB tb, int from, int to); deleteTBdeletes the lines betweenfromandto(inclusive) from the textbuffertb. It should free the memory of the deleted lines. Iftois less thanfrom, abort. Combining textbuffers For all combining functions, if any of the arguments indicating a line number is out of range, the function must print a suitable error message and terminate the program with the standard functionabort(). Note that for these functions, if the number of lines intb1isn, thenn + 1is a valid argument forpos(the lines intb2are added to the end oftb1). mergeTB void mergeTB (TB tb1, int pos, TB tb2); mergeTBmergestb2intotb1at linepos. Afterwards, what was at line 1 oftb2will now be at lineposoftb1. Lineposof tb1will be moved to linepos + linesTB (tb2), after the merged-in lines fromtb2. After this operation,tb2cannot be used anymore (as if we had usedreleaseTBon it). pasteTB void pasteTB (TB tb1, int pos, TB tb2); pasteTBcopies (i.e., pastes) all lines fromtb2intotb1at linepos. It is likemergeTB, buttb2remains unmodified and is still usable independent oftb1. Extracting textbuffers For all extracting functions, if any of the arguments indicating a line number is out of range, the function must print a suitable error message and terminate the program with the standard functionabort(). The textbuffers returned by the extracting functions are as if they were newly created withnewTB(). cutTB TB cutTB (TB tb, int from, int to); cutTBcuts the lines betweenfromandto(inclusive) out of the textbuffertbinto a new textbuffer, which is then returned. If tois less thanfrom, returnNULL. Searching textbuffers Match searchTB (TB tb, char *search); searchTBreturns a linked list of all non-overlapping matches intbof a certain string. The search iscase sensitiveand the textbuffertbmust remain unmodified. The matches must be returned in order of their appearance in the textbuffer. It is the caller's responsibility to free the returned list. Consider callingsearchTB (tb, "love")on the following TB: 1 Hello World My 2 name is jarred lovegood 3 and i love carley ray jepson This should return a list: +====================+ +====================+ | lineNumber: 2 | | lineNumber: 3 | | columnNumber: 16 | | columnNumber: 7 | | next: ----------------->| next: -----------------> NULL +====================+ +====================+ Note that the line number and column number are both1-indexed(i.e., start at 1). The column number refers to a position within the line where there is a match. Note thatMatchis a pointer to the first node in the list. If there are no matches, then returnNULL. Rich text formRichText void formRichText (TB tb); formRichTextsearches every line oftband performs the following substitutions:
StringReplacementExample*some string*<b>some string</b>*hello* -> <b>hello</b>_some string_<i>some string</i>_hello_ -> <i>hello</i>
#some string ... <h1>some string ...</h1> #hello -> <h1>hello</h1> The matching is simplistic in that you would begin scanning at the first special character and continue to consume characters (ignoring any further special characters) until a matching special character. If there is no matching special character, nothing is done and the next special character (if there is one) is processed. Note that the#character must be the first character in a line and there must be more characters on that line for it to be treated as a special character - otherwise, it does nothing. Furthermore, it matches until the end of the line and not until a matching#. See example below.
ExampleResult*some string*some string*some string*lol*<b>some string</b>lol** *<b> </b>
*some_string*again_<b>some_string</b>again_*some* _string_<b>some</b> <i>string</i>some *string_again_some *string<i>again</i>
some#string*once_again* some#string<b>once_again</b> #string_stuff_ <h1>string_stuff_</h1> # # ### <h1>##</h1> Example Result In the case of nested special characters, for example: *some_string_* #some _string_ Take the outermost element and ignore any nesting. Example Result *some_string_* <b>some_string_</b> #some _string_ <h1>some _string_</h1> If there are no characters between a pair of consecutive special characters, for example,hello ** world, ignore it and continue to the next pair of special characters (if there is one). For example: hello ** world --> hello ** world hello **world* --> hello *<b>world</b> hello **world** --> hello *<b>world</b>* **hello***world** --> *<b>hello</b>*<b>world</b>* ***hello* --> **<b>hello</b> Note that in the last case, the first*does nothing because there are no characters between it and the next*. In that case the first *is ignored and the next one is processed as normal. Assignment 1 Bonus Challenges diffTB (1 bonus mark) char *diffTB (TB tb1, TB tb2); Given two text files, we sometimes want to know what changes are made from one file to another file. The functiondiffTBworks out which lines of texts are added or deleted fromtb1to gettb2. The string returned from the function is an edit solution consisting of a series of add and delete commands. Applying such commands ontb1in sequence should result intb2. An edit solution should have one command per line to either add or delete a line of text at a specific line number. An example is given below. The first command adds a line of text 'add this line please' at line 2 of the current textbuffer (counting from 1). The existing line 2 is moved to line 3, and so on. The second command deletes line 3 of the textbuffer. The last command adds the specified text at line 12 of the textbuffer. +,2,add this line please -,3 +,12,add this line as well please A mark is given if your solution satisfies two criteria given below: Correctness- applying your edit solution ontb1results intb2. Compactness- the size of your edit solution (i.e., number of commands/lines) is smaller than or equal to the average of the sizes of the optimal solution and the brute-force solution (which is "delete all lines oftb1and add all lines oftb2"). This is to avoid brute-force solutions, such as the one just described. undoTB and redoTB (1 bonus mark) void undoTB (TB tb); undoTBallows the user to reverse up to 10 of the most-recently called operations ontb. Applicable operations are:deleteTB, mergeTB,pasteTB, andcutTB. Each timeundoTBis called, one operation is reversed ontb. When the maximum number of allowable undo operations is reached, further calls toundoTBshould do nothing (until one of the applicable operations is performed again orredoTBis called). void redoTB (TB tb); The functionredoTBallows the user to redo operations that have been reversed byundoTB. Similar toundoTB, this function should redo one operation ontbper function call. However, when a new operation is called ontb, any reversed operations cannot be executed again withredoTB. Note:When testing yourundoTBandredoTBfunctions, we will not call inapplicable operations such asaddPrefixTBand formRichText, so you do not have to worry about undoing such operations. COMP2521 Assignment 1 FAQ General questions Can I modifytextbuffer.h? No. Can I add my own structs and functions totextbuffer.c? Yes! Make sure your helper functions are declaredstatic, and that you document what the functions and structures you add are for. Can I use functions from<string.h>? Yes. It's much, much harder if you don't. Will I need to check my code withvalgrind? We'll certainly be checking your submission withvalgrindfor memory leaks. CanTBever beNULL? It can be in the case something goes wrong but in any case you have aTBwhich isNULLthe correct course of action would be to print a suitable error message andabort(). Can I use themath.hlibrary? Yes. Will the time complexity of bonus functions affect the bonus marks? No. newTB How doesnewTBwork? If the input text is, for example,"Hi there,\nhow\nare\nthings\n", the textbuffer should contain the following lines: { "Hi there,", "how", "are", "things" }. You will have to process the input text, extract all the substrings separated by newlines, and copy them into your textbuffer structure. Should I leave the'\n'characters in? Depending on your approach to splitting text, they may already be gone. The only other place you need the'\n' characters is indumpTB, so you could probably get away without storing them. But it is up to you. Is it safe to assume that the input text will always have a newline at the end? Unlesstextis the empty string, it will always have a newline at the end. What should happen with multiple consecutive newlines? Every newline marks a new line in the textbuffer, so a newline that immediately follows another newline (or a newline at the beginning of the input text) would represent an empty line. You need to track empty lines. Can I assume a maximum length for lines? No. Your program should be able to dynamically allocate any memory needed for your strings depending on the input text. What if the input text is a empty string? Create an empty TB. What if the input text consists of just a single newline character? Create a TB with one empty line. What if the input text isNULL? We won't be testing this (asNULLis not a valid string), but in this case a sensible thing to do would be to print a suitable error message andabort(). releaseTB How can I testreleaseTB? You can't. You can't write a black-box test for a destructor. When youfree()memory, what you're saying is that you no longer need the block of memory you had a pointer to; it should be irrelevant to you whether that memory's value changes or becomes invalid in some way, becauseyou are absolutely forbidden from accessing the memory once free'd. Use-after-free is an illegal and undefined operation. A good test that yourreleaseTBworked is that your program is still running after you do so. Do note though thatvalgrindmay be useful to help diagnose memory leaks which can indirectly signal a error with yourreleaseTB. dumpTB My textbuffer has no lines; what shoulddumpTBreturn? It should return an empty string, regardless of whethershowLineNumbersistrueorfalse. Note that this string should still be allocated so that the user can free it. addPrefixTB Can the prefix string have newlines in it? No. We will not test these cases. Can the prefix string be the empty string? Yes. In this case, do nothing. Can the prefix string beNULL? No. In this case,abort(). mergeTB What should happen if ImergeTB (tb1, 1, tb1)? Attempts to merge a textbuffer with itself should be ignored. Should I callreleaseTBas well? No!This will probably destroy both the source and destination textbuffers. However, you've moved the contents of the source textbuffer, so you can justfree()as you would inreleaseTB. You must not subsequently dereference it; that's a use-after-free and(say it with me, folks!)use-after-free is illegal. Can I concatenate text buffers withmergeTB? The correct behaviour should be as follows, formergeTB (dest, pos, src): pos == 1: Insertsrcbefore the start ofdest. pos == linesTB (dest): Insertsrcbefore the last line ofdest. pos == linesTB (dest) + 1: Appendsrcto the end ofdest. What should happen iftb1ortb2are empty? Both may be empty. Ifdestis empty then the only valid value forposis 1, which would causesrcto be appended to the end of the empty TB. pasteTB Can a textbuffer be pasted onto itself? Yes! For example, supposetbwas: 1 Never gonna give you up 2 Never gonna let you down Then after callingpasteTB (tb, 2, tb),tbwould look like: 1 Never gonna give you up 2 Never gonna give you up 3 Never gonna let you down 4 Never gonna let you down searchTB Can the search string have newlines in it? No. We will not test these cases. Can the search string be the empty string? Yes. In this case, return an empty list. How should I return an empty list? In general, this depends on the representation of the list. If the list is represented by a structure containing metadata about the list (such as its size, the pointer to the first node, etc.), like in the Week 1 and Week 2 labs, then an empty list is represented by a (pointer to a) metadata structure where the size field is set to 0, and the pointers to the first/last nodes are set toNULL. If the list is represented by a pointer to the first node, then an empty list is represented by aNULLpointer, as there are no nodes in the list. In this assignment, because a list of matches is merely represented by a pointer to the first match node, an empty list is represented byNULL. Can the search string be the NULL? No. In this case,abort(). Can the search string occur multiple times on the same line? Yes. In this case, the returned list of matches should have a node for each of the occurrences on that line. For example, if searchTB (tb, "bird")is called, andtbis: 1 A well a everybody's heard about the bird 2 B-b-b bird, bird, bird, b-bird's the word 3 A well a bird, bird, bird, the bird is the word 4 A well a bird, bird, bird, well the bird is the word 5 A well a bird, bird, bird, b-bird's the word The returned list should be: (1, 38) --> (2, 7) --> (2, 13) --> (2, 19) --> (2, 27) --> (3, 10) --> (3, 16) --> ... How do you handle the case where the search string is a repeated pattern (e.g., looking for 'abab' in 'ababab')? The matches you return should not overlap. After you find a match on a line, the search should resume fromafterthe part of the line that was matched. For example, if we searched for"abracadabra"in the string "abracadabracadabracadabracadabra", the matches are"abracadabracadabracadabracadabra". So if we searched for"abracadabra"in this textbuffer: 1 abracadabra alacazam 2 abracadabracadabracadabracadabra The returned list should be: (1, 1) --> (2, 1) --> (2, 15) --> X formRichText How should I handle cases where there are no characters between a pair of special characters (such as**)? In this case, nothing should happen. Only add the tags if there is at least 1 character being acted on. Can substitutions occur across lines? No. diffTB DoesdiffTBchange either of its textbuffer arguments? No.diffTBis non-destructive. How willdiffTBbe tested? There are many possible valid sequences of commands that could be returned fromdiffTB, so comparing the output against an expected output would not work. Instead, we will parse your edit solution and apply the commands one by one totb1. After applying the commands, we will calldumpTBontb1andtb2. If they return the same string, and your edit solution consists of fewer commands than the threshold, then you pass the test. Could we get an example? Certainly! Suppose that these aretb1(left) andtb2(right): 1 first line 1 first line 2 second line 2 2nd line 3 third line 3 third line 4 fourth line 4 quatre Here are some examples of correct command strings (there are others): "+,2,2nd line\n-,3\n+,4,quatre\n-,5\n" "-,2\n+,2,2nd line\n-,4\n+,4,quatre\n" "-,2\n-,3\n+,2,2nd line\n+,4,quatre\n" "-,4\n-,2\n+,3,quatre\n+,2,2nd line\n" undoTB and redoTB My implementation ofundoTBandredoTBrequires me to modify existing functions (e.g.,mergeTB), which affects their time complexities. Will I be penalised for having a slow time complexity for these functions? Your implementation of the applicable functions will probably be split into two parts: (1) one part that allowsundoTB andredoTBto work, and (2) a second part that actually does the main work of the function. In checking your functions' time complexities we will only consider the second part of the code. Should I record an operation that has no effect on the TB, such as merging an empty textbuffer? It's up to you - we won't be testing these cases. What should happen if I undo a merge? Istb2alive again? If you calledmergeTB (dest, pos, src),srcno longer exists, so callingundoTB (src)is invalid. CallingundoTB (dest)should simply remove the merged lines fromdest(of course, they may reappear again ifredoTB (dest)is called). What should happen if I undo a cut? If you calledcutTB (tb1, from, to), the returned TB (let's call ittb2) is completely independent oftb1. Calling undoTB (tb1)should restore the lines that were cut fromtb1, but have no effect ontb2. What should happen if I undo a paste? If you calledpasteTB (tb1, pos, tb2), the paste operation is performed ontb1, nottb2, sotb2has no record of the operation taking place. Thus, callingundoTB (tb1)should remove the pasted lines fromtb1(they may reappear if redoTB (tb1)is called), while callingundoTB (tb2)should do nothing unless some operation was performed ontb2 before the paste. Plagiarism This is an individual assignment. Each student will have to develop their own solution without help from other people. In particular, it is not permitted to exchange code or pseudocode. You are not allowed to use code developed by persons other than yourself. If you have questions about the assignment, ask your tutor. Plagiarismisdefined asusing the words or ideas of others and presenting them as your own. UNSW and CSE treat plagiarism as academic misconduct, which means that it carries penalties as severe as being excluded from further study at UNSW. There are several on-line sources to help you understand what plagiarism is and how it is dealt with at UNSW: Plagiarism and Academic Integrity UNSW Plagiarism Procedure Make sure that you read and understand these. Ignorance is not accepted as an excuse for plagiarism. In particular, you are also responsible that your assignment files are not accessible by anyone but you by setting the correct permissions in your CSE directory and code repository, if using. Note also that plagiarism includes paying or asking another person to do a piece of work for you and then submitting it as your own work. UNSW has an ongoing commitment to fostering a culture of learning informed by academic integrity. All UNSW staff and students have a responsibility to adhere to this principle of academic integrity. Plagiarism undermines academic integrity and is not tolerated at UNSW. Plagiarism at UNSW is defined as using the words or ideas of others and passing them off as your own. If you haven't done so yet, please take the time to read the full text of UNSW's policy regarding academic honesty and plagiarism The pages below describe the policies and procedures in more detail: Student Code Policy Student Misconduct Procedure Plagiarism Policy Statement Plagiarism Procedure You should also read the following page which describes your rights and responsibilities in the CSE context: Essential Advice for CSE Students